<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:webfeeds="http://webfeeds.org/rss/1.0"><channel><title>Waldek Mastykarz</title><description>Innovation matters</description><link>https://blog.mastykarz.nl/</link><image><url>https://blog.mastykarz.nl/favicon.ico</url><title>Waldek Mastykarz</title><link>https://blog.mastykarz.nl/</link></image><atom:link href="https://blog.mastykarz.nl/feed.xml" rel="self" type="application/rss+xml"/><pubDate>Thu, 19 Jun 2025 10:39:00 GMT</pubDate><lastBuildDate>Thu, 19 Jun 2025 10:39:00 GMT</lastBuildDate><webfeeds:analytics id="UA-3652888-1" engine="GoogleAnalytics"/><ttl>60</ttl><item><title>Benchmark models using OpenAI-compatible APIs</title><link>https://blog.mastykarz.nl/benchmark-models-openai-compatible-apis/</link><guid isPermaLink="true">https://blog.mastykarz.nl/benchmark-models-openai-compatible-apis/</guid><description><p><img src="https://blog.mastykarz.nl/assets/images/2025/06/banner-language-model-results.png" alt="Benchmark models using OpenAI-compatible APIs" class="webfeedsFeaturedVisual" /></p><p>Recently, I wrote about <a href="/language-model-benchmarks-story/">why you should write your own benchmarks for language models</a> to see how they work for your app. I also shared a ready-to-use <a href="https://github.com/waldekmastykarz/ollama-compare">Jupyter Notebook that allows you to evaluate language models on Ollama</a>. I've just published a new version of the <a href="https://github.com/waldekmastykarz/openai-compare">notebook which now supports any language model host that exposes OpenAI-compatible APIs</a>.</p>
<p>Like the previous version, the new notebook shows you how well your selected language models perform for each scenario, and overall.</p>
<p><picture>
<source srcset="https://blog.mastykarz.nl/assets/images/2025/06/language-model-test-oai-per-scenario.webp" type="image/webp">
<img src="https://blog.mastykarz.nl/assets/images/2025/06/language-model-test-oai-per-scenario.png" alt="Bar chart showing benchmark results for different language models across four tasks: getCalendarForUser, postCalendarForUser, getUser, and postCalendarForUser. The chart displays average combined scores on the y-axis from 0 to 1.0, with a horizontal line at 0.8 indicating the threshold. Six models are compared: Phi-3.5-mini-instruct-generic-gpu, Phi-4-mini-instruct-generic-gpu, mistralai-Mistral-7B-Instruct-v0-2-generic-gpu, qwen2.5-7b-instruct-generic-gpu, qwen2.5-1.5b-instruct-generic-gpu, and qwen2.5-0.5b-instruct-generic-gpu. Most models achieve perfect scores of 1.0 across tasks, with some exceptions showing lower scores particularly for the smallest qwen2.5-0.5b model. The chart title reads Dev Proxy v0.29.0 OpenAPI Operation ID and includes a legend on the right side identifying each model with different colored bars.">
</picture></p>
<p><picture>
<source srcset="https://blog.mastykarz.nl/assets/images/2025/06/language-model-test-oai-overall.webp" type="image/webp">
<img src="https://blog.mastykarz.nl/assets/images/2025/06/language-model-test-oai-overall.png" alt="Bar chart displaying benchmark results for six language models tested across four OpenAI API operations. The vertical axis shows Average Combined Score from 0 to 1.0, with a red horizontal threshold line at 0.8. The models tested are Phi-3.5-mini-instruct-generic-gpu achieving a perfect score of 1.0, Phi-4-mini-instruct-generic-gpu scoring 0.999, qwen2.5-7b-instruct-generic-gpu scoring 0.975, qwen2.5-1.5b-instruct-generic-gpu scoring 0.849, mistralai-Mistral-7B-Instruct-v0-2-generic-gpu scoring 0.763, and qwen2.5-0.5b-instruct-generic-gpu scoring 0.556. All bars are colored in blue against a white background with a grid. The chart title reads Dev Proxy v0.29.0 OpenAPI Operation ID. The visualization demonstrates that larger models generally perform better, with only the two smallest models falling below the 0.8 performance threshold. The professional appearance suggests this is from a technical benchmarking study or research publication.">
</picture></p>
<p>What's changed, is that you can now test models running not just on Ollama but on any host that exposes OpenAI-compatible APIs, be it <a href="https://learn.microsoft.com/azure/ai-foundry/foundry-local/what-is-foundry-local">Foundry Local</a>, <a href="https://ollama.com/">Ollama</a>, <a href="https://learn.microsoft.com/azure/ai-foundry/what-is-azure-ai-foundry">Azure AI Foundry</a>, or <a href="https://openai.com/">OpenAI</a>. When selecting the models to test, you can now specify the language model host, and an API key if you need one:</p>
<pre><code class="language-python">from openai import OpenAI

# Foundry Local
client = OpenAI(
api_key=&quot;local&quot;,
base_url=&quot;http://localhost:5272/v1&quot;
)
models = [&quot;qwen2.5-0.5b-instruct-generic-gpu&quot;, &quot;qwen2.5-1.5b-instruct-generic-gpu&quot;, &quot;qwen2.5-7b-instruct-generic-gpu&quot;, &quot;Phi-4-mini-instruct-generic-gpu&quot;, &quot;Phi-3.5-mini-instruct-generic-gpu&quot;, &quot;mistralai-Mistral-7B-Instruct-v0-2-generic-gpu&quot;]
</code></pre>
<p>By default, the notebook includes a configuration for Foundry Local and Ollama.</p>
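<p>For example, a configuration for Ollama might look roughly like this (Ollama exposes an OpenAI-compatible endpoint on port 11434 by default, and the API key can be any non-empty value; the model names are just examples):</p>
<pre><code class="language-python"># Ollama
client = OpenAI(
    api_key=&quot;ollama&quot;,
    base_url=&quot;http://localhost:11434/v1&quot;
)
models = [&quot;phi3.5&quot;, &quot;llama3.2&quot;, &quot;qwen2.5:3b&quot;]
</code></pre>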
<p>Another thing that I've changed is storing prompts in <a href="https://prompty.ai">Prompty</a> files. Using Prompty makes it convenient to separate the prompts from the notebook's code. It also makes it easy to quickly run a prompt and see the result of any changes you make. I suggest you try it yourself to see how easy it is.</p>
<p>If you're using language models in your app, you should be testing them to choose the one that's best for your scenario. <a href="https://github.com/waldekmastykarz/openai-compare">This Jupyter Notebook</a> helps you do that.</p>
</description><pubDate>Thu, 19 Jun 2025 10:39:00 GMT</pubDate></item><item><title>Language model benchmarks only tell half a story</title><link>https://blog.mastykarz.nl/language-model-benchmarks-story/</link><guid isPermaLink="true">https://blog.mastykarz.nl/language-model-benchmarks-story/</guid><description><p><img src="https://blog.mastykarz.nl/assets/images/2025/06/banner-language-model-results.png" alt="Language model benchmarks only tell half a story" class="webfeedsFeaturedVisual" /></p><p>When it comes to language models, we tend to look at benchmarks to decide which model is the best to use in our application. But benchmarks only tell half a story. Unless you're building an all-purpose chat application, what you should be actually looking at is how well a model works for your application.</p>
<p><em>This article is based on a benchmark I put together to evaluate which of the language models available on Ollama is best suited for Dev Proxy. You can find the <a href="https://github.com/waldekmastykarz/ollama-compare">working prototype</a> on GitHub. I'm working on a similar solution that supports OpenAI-compatible APIs and will publish it shortly.</em></p>
<h2>When the best isn't the best</h2>
<p>When we started looking at integrating local language models with <a href="https://aka.ms/devproxy">Dev Proxy</a> to improve generating OpenAPI specs, we used Phi-3 on Ollama. It seemed like a reasonable choice: the model was pretty quick and capable, at least according to benchmarks. It turned out that we couldn't have picked a worse model for our needs.</p>
<p><picture>
<source srcset="https://blog.mastykarz.nl/assets/images/2025/06/language-model-results-per-model.webp" type="image/webp">
<img src="https://blog.mastykarz.nl/assets/images/2025/06/language-model-results-per-model.png" alt="Bar chart showing average combined scores for four language models in optimized operation ID generation. From left to right: phi3.5 scored 0.967, llama3.2 scored 0.954, qwen2.5:3b scored 0.922, and phi3 scored 0.517. A red horizontal line at 0.8 indicates the performance threshold. The chart demonstrates phi3.5 and llama3.2 as top performers, while phi3 significantly underperforms compared to the others.">
</picture></p>
<h2>The problem with language model benchmarks</h2>
<p>We love benchmarks and comparisons. In a world of abundance, they offer an easy answer: which product is the best and which one we should choose. The thing is, though, that a benchmark is based on <em>some</em> criteria. And unless you plan to use the product exactly according to these criteria, the benchmark is, well, useless. Sure, the results tell you something about the product, but it's hardly the right information on which to base your choice.</p>
<p>So rather than relying on standard benchmarks, when it comes to choosing the best model for your application, what you actually need to know is how well a model works for <strong>your scenario</strong>. You need <strong>your own benchmark</strong>.</p>
<h2>Build your own language model benchmark</h2>
<p>Evaluating language models is a science. There are many papers written on the subject, and if you're building a language model for general use, you should definitely read them. But if you're reading this article, probably all you care about is which model you should use in your application. You don't care about most cases. You care about <strong>your case</strong>, and how the model works <strong>for you</strong>. To understand how well a model works for you, you should put together a benchmark.</p>
<h3>Benchmark building blocks</h3>
<p>Typically, a benchmark consists of the following elements:</p>
<ul>
<li>one or more test cases including the test scenario (prompt and parameters) and expected outcome</li>
<li>evaluation criteria</li>
<li>scoring system</li>
</ul>
<p>Test cases represent how your application is using the language model. For each test case, you need to include one or more expected results. Evaluation criteria define how you're going to verify, in a repeatable way, that the actual result for a test case matches the one you expected. The scoring system allows you to quickly compare the results for each model you tested and tell which one works best for you.</p>
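<p>To make these building blocks concrete, here's one way you could express them in Python. This is a minimal sketch with made-up test data based on the Dev Proxy examples later in this article; the exact shape doesn't matter, only that each element is captured:</p>
<pre><code class="language-python"># one or more test cases: scenario inputs (prompt parameters) plus expected outcomes (references)
test_cases = [
    {
        'method': 'POST',
        'url': 'https://graph.microsoft.com/users/{users-id}/calendars',
        'references': ['addUserCalendar', 'createUserCalendar', 'postCalendarForUser']
    },
    {
        'method': 'GET',
        'url': 'https://graph.microsoft.com/users/{users-id}',
        'references': ['getUser', 'getUserById']
    }
]

# evaluation criteria and scoring system: scoring functions and their weights
weights = {'bert_f': 0.45, 'edit_distance': 0.10, 'rouge1': 0.25, 'rougeL': 0.20}
</code></pre>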
<h3>Benchmark for a language model</h3>
<p>Since we're talking about testing language models, there are a few more things that you need to consider while building a benchmark.</p>
<p>Language models are generative and their output is non-deterministic. If you run the same prompt twice, you'll likely get two different answers. This means that:</p>
<ol>
<li>To get a representative result, you need to run each test case several times.</li>
<li>There are probably several correct answers to each test case.</li>
<li>You can't use a simple string comparison to evaluate the actual and expected output.</li>
</ol>
<h4>Evaluating language model output</h4>
<p>To solve the first challenge, you need to run each test case several times and average the scores.</p>
<p>For the second part, you should provide several reference examples. When scoring the language model's output against these references, take the highest score.</p>
<p>The final problem is a bit trickier. The good news is that it's not a new problem, and there are several commonly used functions for scoring language model output, such as BLEU, ROUGE, BERTScore, or edit distance. The challenging part is that you'll likely need a combination of these functions to determine the efficacy of a model for your scenario. To do that, you define a weighted score, where each function is assigned an importance (weight) in the overall score.</p>
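<p>Here's what such a weighted score could look like in Python. This is a minimal sketch using the <code>bert-score</code> and <code>rouge-score</code> packages, with <code>difflib</code> from the standard library standing in for edit distance; the weights are illustrative (the weights we actually use follow below):</p>
<pre><code class="language-python">from difflib import SequenceMatcher

from bert_score import score as bert_score   # pip install bert-score
from rouge_score import rouge_scorer         # pip install rouge-score

WEIGHTS = {'bert_f': 0.45, 'edit_distance': 0.10, 'rouge1': 0.25, 'rougeL': 0.20}
_rouge = rouge_scorer.RougeScorer(['rouge1', 'rougeL'], use_stemmer=True)

def combined_score(candidate, reference):
    # semantic similarity (BERTScore F1)
    _, _, f1 = bert_score([candidate], [reference], lang='en')
    # character-level similarity as a stand-in for edit distance
    edit_similarity = SequenceMatcher(None, reference, candidate).ratio()
    rouge = _rouge.score(reference, candidate)
    return (WEIGHTS['bert_f'] * f1.item()
            + WEIGHTS['edit_distance'] * edit_similarity
            + WEIGHTS['rouge1'] * rouge['rouge1'].fmeasure
            + WEIGHTS['rougeL'] * rouge['rougeL'].fmeasure)

def best_score(candidate, references):
    # several answers can be correct: score against every reference and keep the best
    return max(combined_score(candidate, ref) for ref in references)
</code></pre>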
<h4>Example evaluations for Dev Proxy scenarios</h4>
<p>To put it all in practice, let's look at two scenarios that we use in Dev Proxy.</p>
<h5>Generate API operation ID</h5>
<p>One way we use language models in Dev Proxy is to generate the ID of an API operation from a request method and URL when generating OpenAPI specs. For example, for a request like <code>POST https://graph.microsoft.com/users/{users-id}/calendars</code>, we want to get <code>addUserCalendar</code> or an equivalent.</p>
<p>For this case, we use the following functions and weights:</p>
<table>
<thead>
<tr>
<th>Function</th>
<th style="text-align:right">Weight</th>
</tr>
</thead>
<tbody>
<tr>
<td>BERT-F</td>
<td style="text-align:right">0.45</td>
</tr>
<tr>
<td>Edit distance</td>
<td style="text-align:right">0.10</td>
</tr>
<tr>
<td>ROUGE-1</td>
<td style="text-align:right">0.25</td>
</tr>
<tr>
<td>ROUGE-L</td>
<td style="text-align:right">0.20</td>
</tr>
</tbody>
</table>
<p>We assign the most weight to BERT-F because we want the generated ID to be semantically accurate. It would be nice if the generated ID matched one of our expected IDs (ROUGE-L), and the more its words overlap with them, the better (ROUGE-1). Finally, we don't want the ID to deviate too much from our examples (Edit distance).</p>
<p>You could have endless debates about each weight. It doesn't really matter whether you assign 0.45 or 0.43 to a function. What matters is that you can explain which characteristics of the answer you care about and why, and that answers that are good in your eyes get good scores.</p>
<h5>Generate API operation description</h5>
<p>Another scenario for which we use language models in Dev Proxy is to generate an operation's description for use in an OpenAPI spec. For example, given a request <code>GET https://api.contoso.com/users/{users-id}/calendars</code>, we want to get <code>Retrieve a user's calendars</code> or something similar. For this case, we use the following evaluation criteria:</p>
<table>
<thead>
<tr>
<th>Function</th>
<th style="text-align:right">Weight</th>
</tr>
</thead>
<tbody>
<tr>
<td>BERT-F</td>
<td style="text-align:right">0.45</td>
</tr>
<tr>
<td>Edit distance</td>
<td style="text-align:right">0.10</td>
</tr>
<tr>
<td>ROUGE-2</td>
<td style="text-align:right">0.25</td>
</tr>
<tr>
<td>ROUGE-L</td>
<td style="text-align:right">0.20</td>
</tr>
</tbody>
</table>
<p>In this case, we replaced ROUGE-1 with ROUGE-2. When generating the operation ID, we looked for overlap of individual tokens (e.g. <code>getUserById</code> vs. <code>getUserByIdentifier</code> matches 3 out of 4 words). Since the description is more elaborate, we decided to look at matching word pairs rather than single words.</p>
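<p>A quick way to get a feel for the difference is to compute both metrics side by side. A small sketch using the <code>rouge-score</code> package, with made-up strings:</p>
<pre><code class="language-python">from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(['rouge1', 'rouge2'], use_stemmer=True)

reference = 'Retrieve the calendars of a user'
candidate = 'Retrieve the list of calendars for a user'

# ROUGE-1 rewards overlapping single words, ROUGE-2 rewards overlapping word pairs,
# so for longer text ROUGE-2 says more about the phrasing
scores = scorer.score(reference, candidate)
print('ROUGE-1 F1:', round(scores['rouge1'].fmeasure, 3))
print('ROUGE-2 F1:', round(scores['rouge2'].fmeasure, 3))
</code></pre>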
<p>Again, this isn't <em>the</em> way to evaluate language model output. This is the way <em>we</em> chose for <em>our</em> scenario. I encourage you to experiment with different functions and weights to see what works for you.</p>
<h4>A word on BERTScore</h4>
<p>BERTScore is a great and convenient way to see if the generated text is semantically similar to the reference. The only trouble is that, compared to other scoring functions, even weak results score as high as 0.8 out of 1. So if you combine it with other functions, you'll get skewed results. To avoid this, you need to normalize its value, for example: for x &gt; 0.95 give full contribution (1), for 0.75 &lt;= x &lt; 0.95 scale linearly, and for x &lt; 0.75 drop to 0.</p>
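<p>In code, that normalization is a direct translation of those thresholds:</p>
<pre><code class="language-python">def normalize_bert_score(x):
    # BERTScore rarely drops much below ~0.8, so rescale it before combining it with other metrics
    if x &gt; 0.95:
        return 1.0  # full contribution
    if x &gt;= 0.75:
        return (x - 0.75) / (0.95 - 0.75)  # scale linearly between 0.75 and 0.95
    return 0.0  # too dissimilar to count
</code></pre>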
<h3>Putting it all together</h3>
<p>Putting it all together, you'll end up with a solution that consists of the following building blocks, sketched in code after the list:</p>
<ul>
<li>test scenario based on a parametrized language model prompt</li>
<li>one or more test cases, each with values for the prompt parameters and several reference values</li>
<li>weighted score based on several scoring functions</li>
<li>test runner that:
<ul>
<li>runs the test scenario against a language model,</li>
<li>evaluates and scores its output,</li>
<li>presents the comparison result</li>
</ul>
</li>
</ul>
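<p>Here's a compressed sketch of what such a test runner could look like, assuming a <code>run_prompt</code> helper that calls the language model host and the <code>best_score</code> helper from earlier. The real notebook does more, such as caching responses and plotting the results:</p>
<pre><code class="language-python">import statistics

RUNS_PER_CASE = 5  # run each case several times because the output is non-deterministic

def benchmark(models, test_cases):
    results = {}
    for model in models:
        case_scores = []
        for case in test_cases:
            run_scores = []
            for _ in range(RUNS_PER_CASE):
                # run_prompt is a placeholder for a call to the language model host
                answer = run_prompt(model, method=case['method'], url=case['url'])
                # compare against all references and keep the highest score
                run_scores.append(best_score(answer, case['references']))
            case_scores.append(statistics.mean(run_scores))
        results[model] = statistics.mean(case_scores)
    return results
</code></pre>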
<p><picture>
<source srcset="https://blog.mastykarz.nl/assets/images/2025/06/language-model-results-per-model-per-case.webp" type="image/webp">
<img src="https://blog.mastykarz.nl/assets/images/2025/06/language-model-results-per-model-per-case.png" alt="Bar chart comparing language model performance across four different tasks. Shows average combined scores for qwen2.5:3b, phi3.5, llama3.2, and phi3 models on getCalendarForUser, getCalendarForUser, getUser, and postCalendarForUser tasks. A horizontal red line at 0.8 marks the performance threshold. Most models score above 0.9 on the first three tasks, with phi3.5, llama3.2, and qwen2.5:3b performing consistently well. However, phi3 shows dramatically lower performance on the first task at approximately 0.146, while all other scores remain above 0.8. The chart demonstrates significant performance variation between models and tasks, with phi3 being particularly unsuitable for certain operations despite adequate performance on others.">
</picture></p>
<p><picture>
<source srcset="https://blog.mastykarz.nl/assets/images/2025/06/language-model-results-per-model.webp" type="image/webp">
<img src="https://blog.mastykarz.nl/assets/images/2025/06/language-model-results-per-model.png" alt="Bar chart showing average combined scores for four language models in optimized operation ID generation. From left to right: phi3.5 scored 0.967, llama3.2 scored 0.954, qwen2.5:3b scored 0.922, and phi3 scored 0.517. A red horizontal line at 0.8 indicates the performance threshold. The chart demonstrates phi3.5 and llama3.2 as top performers, while phi3 significantly underperforms compared to the others.">
</picture></p>
<p>Check out the <a href="https://github.com/waldekmastykarz/ollama-compare">sample testing workbench</a> I created, which allows you to evaluate language models running on Ollama. I chose to build it as a Jupyter Notebook using Python for several reasons:</p>
<ul>
<li>Python has ready-to-use libraries for scoring functions, data manipulation, and presentation logic</li>
<li>Since it's a fully fledged programming language, I can implement caching of the language models and responses for quicker execution</li>
<li>Jupyter Notebooks:
<ul>
<li>allow you to combine code and content with additional information</li>
<li>can be run interactively</li>
<li>combined with VSCode extensions such as <a href="https://marketplace.visualstudio.com/items?itemName=ms-toolsai.datawrangler">Data Wrangler</a> allow you to explore the variables in your notebook which is invaluable for debugging and understanding the results</li>
<li>persist the last execution state. If you change something, you can conveniently re-run the notebook from the changed part, speeding up the execution</li>
</ul>
</li>
</ul>
<p>I'm working on a similar version based on OpenAI-compatible APIs, which will also support <a href="https://prompty.ai">Prompty</a> files.</p>
<h2>Trust but verify</h2>
<p>Next time you're building an application that uses a language model, be sure to test a few models and compare the results. Don't just blindly trust generic benchmarks that aren't representative of how you use language models in your applications; see what works best for you. When you change the model you use or update the prompt, re-run your tests and compare the results. You'll be surprised to see how much difference a model or a prompt can make. When you start testing how you use language models, not only will you see which model works best for you, but you'll also discover which prompts you could improve. If a model scores well in one test but poorly in another, and the scenarios aren't wildly different, try changing the prompt. Stay curious.</p>
</description><pubDate>Tue, 17 Jun 2025 15:11:40 GMT</pubDate></item><item><title>API request encoding matters</title><link>https://blog.mastykarz.nl/api-request-encoding-matters/</link><guid isPermaLink="true">https://blog.mastykarz.nl/api-request-encoding-matters/</guid><description><p><img src="https://blog.mastykarz.nl/assets/images/2025/06/banner-api-encoding.png" alt="API request encoding matters" class="webfeedsFeaturedVisual" /></p><p>When you submit data to an API, the character encoding is crucial to ensure that the data is processed properly, especially if you're working with UTF characters.</p>
<p>Consider the following HTTP request:</p>
<pre><code class="language-http">POST http://api.ecs.eu/feedback
Content-Type: application/json
{
&quot;id&quot;: {{$randomInt 9 1000000}},
&quot;feedback&quot;: &quot;🫠&quot;,
&quot;date&quot;: &quot;{{$datetime iso8601}}&quot;
}
</code></pre>
<p>Notice that the <code>content-type</code> is set to <code>application/json</code> without a specific encoding, and that the <code>feedback</code> property contains an emoji. Most likely, the API will respond with something like:</p>
<pre><code class="language-http">HTTP/1.1 201 Created
content-type: application/json; charset=utf-8
Content-Length: 82
{
&quot;id&quot;: 972680,
&quot;feedback&quot;: &quot;ð\x9f« &quot;,
&quot;date&quot;: &quot;2025-06-16T11:33:28.326Z&quot;
}
</code></pre>
<p>Notice how the contents of the <code>feedback</code> property are broken. This is caused by the lack of encoding in the request. To fix this, extend the <code>content-type</code> header with the encoding:</p>
<pre><code class="language-http">POST http://api.ecs.eu/feedback
Content-Type: application/json; charset=utf-8
{
&quot;id&quot;: {{$randomInt 9 1000000}},
&quot;feedback&quot;: &quot;😎&quot;,
&quot;date&quot;: &quot;{{$datetime iso8601}}&quot;
}
</code></pre>
<p>Now, the API will respond with the correct encoding:</p>
<pre><code class="language-http">HTTP/1.1 201 Created
content-type: application/json; charset=utf-8
Content-Length: 82
{
&quot;id&quot;: 972680,
&quot;feedback&quot;: &quot;😎&quot;,
&quot;date&quot;: &quot;2025-06-16T11:33:28.326Z&quot;
}
</code></pre>
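<p>The same applies when you send the request from code. For example, if you were using Python's <code>requests</code> library (an illustrative sketch, not the client I used), you'd encode the body yourself and declare the charset explicitly:</p>
<pre><code class="language-python">import json

import requests

payload = {
    'id': 972680,
    'feedback': '😎',
    'date': '2025-06-16T11:33:28.326Z'
}

response = requests.post(
    'http://api.ecs.eu/feedback',
    # encode the body yourself so the bytes match the declared charset
    data=json.dumps(payload, ensure_ascii=False).encode('utf-8'),
    headers={'Content-Type': 'application/json; charset=utf-8'}
)
print(response.json())
</code></pre>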
<p>This issue caught me off guard while presenting how to use <a href="https://aka.ms/devproxy">Dev Proxy</a> to <a href="https://learn.microsoft.com/microsoft-cloud/dev/dev-proxy/how-to/simulate-crud-api">emulate CRUD APIs</a> at the recent European Cloud Summit. While I first suspected an issue with Dev Proxy, it turned out to be a misconfiguration of the client I was using to submit the request to the API.</p>
</description><pubDate>Mon, 16 Jun 2025 13:56:17 GMT</pubDate></item><item><title>Use Prompty with Foundry Local</title><link>https://blog.mastykarz.nl/prompty-foundry-local/</link><guid isPermaLink="true">https://blog.mastykarz.nl/prompty-foundry-local/</guid><description><p><img src="https://blog.mastykarz.nl/assets/images/2025/06/banner.png" alt="Use Prompty with Foundry Local" class="webfeedsFeaturedVisual" /></p><p><a href="https://prompty.ai/">Prompty</a> is a powerful tool for managing prompts in AI applications. Not only does it allow you to easily test your prompts during development, but it also provides observability, understandability and portability. Here's how to use Prompty with Foundry Local to support your AI applications with on-device inference.</p>
<h2>Foundry Local</h2>
<p>At the Build '25 conference, Microsoft announced <a href="https://devblogs.microsoft.com/foundry/unlock-instant-on-device-ai-with-foundry-local/">Foundry Local</a>, a new tool that allows developers to run AI models locally on their devices. Foundry Local offers developers several benefits, including performance, privacy, and cost savings.</p>
<h2>Why Prompty?</h2>
<p>When you build AI applications with Foundry Local, or any other language model host, consider using Prompty to manage your prompts. With Prompty, you store your prompts in separate files, making it easy to test and adjust them without changing your code. Prompty also supports templating, allowing you to create dynamic prompts that adapt to different contexts or user inputs.</p>
<h2>Using Prompty with Foundry Local</h2>
<p>The most convenient way to use Prompty with Foundry Local is to create a new configuration for Foundry Local. Using a separate configuration allows you to seamlessly test your prompts without having to repeat the configuration for every prompt. It also allows you to easily switch between different configurations, such as Foundry Local and other language model hosts.</p>
<h3>Install Prompty and Foundry Local</h3>
<p>To get started, install the <a href="https://marketplace.visualstudio.com/items?itemName=ms-toolsai.prompty">Prompty Visual Studio Code extension</a> and <a href="https://learn.microsoft.com/en-us/azure/ai-foundry/foundry-local/get-started">Foundry Local</a>. Start Foundry Local from the command line by running <code>foundry service start</code> and note the URL on which it listens for requests, such as <code>http://localhost:5272</code> or <code>http://localhost:5273</code></p>
<h3>Create a new Prompty configuration for Foundry Local</h3>
<p>If you don't have a Prompty file yet, create one to easily access Prompty settings. In Visual Studio Code, open <strong>Explorer</strong>, right-click to open the context menu, and select <strong>New Prompty</strong>. This creates a <code>basic.prompty</code> file in your workspace.</p>
<p><picture>
<source srcset="https://blog.mastykarz.nl/assets/images/2025/06/prompty-file.webp" type="image/webp">
<img src="https://blog.mastykarz.nl/assets/images/2025/06/prompty-file.png" alt="Visual Studio Code Explorer panel showing a newly created Prompty file named basic.prompty. The file is also opened in the editor.">
</picture></p>
<h3>Create the Foundry Local configuration</h3>
<p>From the status bar, select <strong>default</strong> to open the Prompty configuration picker. When prompted to select the configuration, choose <strong>Add or Edit...</strong>.</p>
<p><picture>
<source srcset="https://blog.mastykarz.nl/assets/images/2025/06/prompty-add-edit-configuration.webp" type="image/webp">
<img src="https://blog.mastykarz.nl/assets/images/2025/06/prompty-add-edit-configuration.png" alt="Prompty configuration picker in the Visual Studio Code status bar, showing the default configurations and an option to add or edit configurations.">
</picture></p>
<p>In the settings pane, choose <strong>Edit in settings.json</strong>.</p>
<p><picture>
<source srcset="https://blog.mastykarz.nl/assets/images/2025/06/prompty-edit-model-configurations-settings.webp" type="image/webp">
<img src="https://blog.mastykarz.nl/assets/images/2025/06/prompty-edit-model-configurations-settings.png" alt="Visual Studio Code settings interface displaying the Prompty Model Configurations section. The interface shows options to configure model settings with an Edit in settings.json button prominently displayed.">
</picture></p>
<p>In the <code>settings.json</code> file, add a new configuration for Foundry Local to the <strong>prompty.modelConfigurations</strong> collection, for example (ignore the comments):</p>
<pre><code class="language-json">{
// Foundry Local model ID that you want to use
&quot;name&quot;: &quot;Phi-4-mini-instruct-generic-gpu&quot;,
// API type; Foundry Local exposes OpenAI-compatible APIs
&quot;type&quot;: &quot;openai&quot;,
// API key required for the OpenAI SDK, but not used by Foundry Local
&quot;api_key&quot;: &quot;local&quot;,
// The URL where Foundry Local exposes its API
&quot;base_url&quot;: &quot;http://localhost:5272/v1&quot;
}
</code></pre>
<blockquote>
<p><strong>Important</strong>: Be sure to check that you use the correct URL for Foundry Local. If you started Foundry Local with a different port, adjust the URL accordingly.</p>
</blockquote>
<p>Save your changes, and go back to the <code>.prompty</code> file. Once again, select the <strong>default</strong> configuration from the status bar, and from the list choose <strong>Phi-4-mini-instruct-generic-gpu</strong>.</p>
<p><picture>
<source srcset="https://blog.mastykarz.nl/assets/images/2025/06/prompty-select-configuration.webp" type="image/webp">
<img src="https://blog.mastykarz.nl/assets/images/2025/06/prompty-select-configuration.png" alt="Visual Studio Code dropdown menu displaying Prompty configuration options. The menu shows several configuration choices including default, Phi-4-mini-instruct-generic-gpu (which appears to be highlighted or selected), and other model configurations.">
</picture></p>
<p>Since the model and API are configured, you can remove them from the <code>.prompty</code> file.</p>
<p><picture>
<source srcset="https://blog.mastykarz.nl/assets/images/2025/06/prompty-no-config.webp" type="image/webp">
<img src="https://blog.mastykarz.nl/assets/images/2025/06/prompty-no-config.png" alt="Visual Studio Code editor displaying a basic.prompty file with the model and api sections removed from the configuration.">
</picture></p>
<h3>Test your prompts</h3>
<p>With the newly created Foundry Local configuration selected, in the <code>.prompty</code> file, press <kbd>F5</kbd> to test the prompt.</p>
<p>The first time you run the prompt, it may take a few seconds because Foundry Local needs to load the model.</p>
<p><picture>
<source srcset="https://blog.mastykarz.nl/assets/images/2025/06/prompty-running.webp" type="image/webp">
<img src="https://blog.mastykarz.nl/assets/images/2025/06/prompty-running.png" alt="Visual Studio Code editor displaying a notification message from the Prompty extension that it's using the Phi-4-mini-instruct-generic-gpu model configuration.">
</picture></p>
<p>Eventually, you should see the response from Foundry Local in the output pane.</p>
<p><picture>
<source srcset="https://blog.mastykarz.nl/assets/images/2025/06/prompty-foundry-local-output.webp" type="image/webp">
<img src="https://blog.mastykarz.nl/assets/images/2025/06/prompty-foundry-local-output.png" alt="Visual Studio Code editor displaying the response from Foundry Local in the output panel.">
</picture></p>
<h2>Summary</h2>
<p>Using Prompty with Foundry Local allows you to easily manage and test your prompts while running AI models locally. By creating a dedicated Prompty configuration for Foundry Local, you can conveniently test your prompts with Foundry Local models and switch between different model hosts and models if needed.</p>
</description><pubDate>Fri, 13 Jun 2025 17:11:39 GMT</pubDate></item><item><title>Integrate Microsoft 365 Copilot declarative agents with Azure AI Search</title><link>https://blog.mastykarz.nl/integrate-microsoft-365-copilot-declarative-agents-azure-ai-search/</link><guid isPermaLink="true">https://blog.mastykarz.nl/integrate-microsoft-365-copilot-declarative-agents-azure-ai-search/</guid><description><p><img src="https://blog.mastykarz.nl/assets/images/2025/05/banner.png" alt="Integrate Microsoft 365 Copilot declarative agents with Azure AI Search" class="webfeedsFeaturedVisual" /></p><p>Bringing knowledge to Microsoft 365 Copilot declarative agents using Azure AI Search is the one option that we don't talk about enough. And it's a shame, because it's a great way to strike the balance between having more control over indexing and relevance, without the complexity of standing up a custom engine agent with a language model deployment. Here's why.</p>
<h2>Know it all</h2>
<p>We all want our agents to know as much as possible. After all, the more they know, the more helpful they are, right? While the language models that back them keep improving, they're still far from including everything. A lot of the knowledge that we need agents to have is specific to our organization. Training a custom model on organizational knowledge, in most cases, just doesn't make sense. The information changes too often, and for most organizations it's not business-critical enough to justify the cost of training a model. In many cases, knowledge items also need to be secured so that only those with the right permissions can see them. This is why we resort to retrieval-augmented generation (RAG) as the mechanism to bring additional knowledge to our agents.</p>
<h2>RAG on Microsoft 365 Copilot agents</h2>
<p>When building Microsoft 365 Copilot agents, we have several options for bringing additional knowledge to them.</p>
<p><picture>
<source srcset="https://blog.mastykarz.nl/assets/images/2025/05/m365-agent-rag.webp" type="image/webp">
<img src="https://blog.mastykarz.nl/assets/images/2025/05/m365-agent-rag.png" alt="Different options for bringing additional knowledge to Microsoft 365 Copilot agents and their pros and cons">
</picture></p>
<h3>OneDrive/SharePoint</h3>
<p>The first, and the most obvious option, is to use OneDrive or SharePoint to store your knowledge items. It's a hands-off approach that just works. All you need to do is upload your files to a document library, and Microsoft 365 will take care of the rest. It'll automatically index the files and make them available to Microsoft 365 Copilot and its agents. It also automatically ensures that only those with the right permissions can see the files in agents' responses. But this simplicity comes at a cost.</p>
<p>Microsoft 365 doesn't support indexing all file types. While it does support a lot of them, if your organization uses one that's not supported, Microsoft 365 won't index them, and there's nothing you can do about it.</p>
<p>You also don't get to choose how Microsoft 365 indexes your files. You simply upload files, and Microsoft 365 takes care of the rest. You have no control over how the files' contents are parsed and how they're chunked.</p>
<p>Finally, you don't get to choose the relevance of the files. For most, that's a good thing, because let's face it: most of us aren't experts in relevance algorithms and search strategies. But if you understand your domain knowledge and have specific requirements for telling which knowledge items are more relevant than others, you might want to have more control over the relevance.</p>
<h3>Custom engine agents</h3>
<p>On the other side of the spectrum are custom engine agents. Building a custom engine agent means that you're bringing the full infrastructure: from the compute resource that hosts the agent to the language model that powers it. While it gives you full control over indexing additional data sources, tool usage, and orchestration, it also means additional complexity to ensure security and responsible use of AI. There's however a middle ground between the two extremes.</p>
<h3>Microsoft Graph connectors</h3>
<p>Another option at your disposal is to use Microsoft Graph connectors. Graph connectors ingest data from external systems into Microsoft 365. There are many connectors available in Microsoft 365 for you to use, and you can also build your own.</p>
<p>If your main challenge is having knowledge stored in unsupported data formats, then Graph connectors are a great option you should consider. Using Graph connectors, you basically create a push-based search index inside Microsoft 365. You define the schema for your data, specify which properties are searchable and filterable, and push the content, along with its access control list, to Microsoft 365.</p>
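<p>To make that concrete, here's roughly what pushing a single item looks like with a plain REST call to the Microsoft Graph external items API. This is a hedged sketch: the connection ID, item ID, property names, and ACL values all depend on the connection and schema you registered, and the access token is assumed to have been acquired already:</p>
<pre><code class="language-python">import requests

GRAPH = 'https://graph.microsoft.com/v1.0'
ACCESS_TOKEN = 'your-access-token'   # placeholder: acquire via MSAL or similar
CONNECTION_ID = 'knowledgebase'      # placeholder: the ID of your Graph connector connection

item = {
    # only principals on the access control list see this item in search results
    'acl': [
        {'type': 'group', 'value': 'your-group-id', 'accessType': 'grant'}
    ],
    # property names must match the schema you registered for the connection
    'properties': {
        'title': 'Returns policy',
        'url': 'https://contoso.example/kb/returns-policy'
    },
    # you decide how to extract and chunk the content before pushing it
    'content': {
        'type': 'text',
        'value': 'Customers can return items within 30 days of purchase...'
    }
}

response = requests.put(
    f'{GRAPH}/external/connections/{CONNECTION_ID}/items/returns-policy-1',
    headers={'Authorization': f'Bearer {ACCESS_TOKEN}'},
    json=item
)
print(response.status_code)
</code></pre>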
<p>Since you control pushing the content to Microsoft 365, you have some degree of control over how the content is indexed. It's your responsibility to extract the content from the source system and transform it into a structure compatible with Microsoft 365. This allows you to parse complex elements and include all the necessary information to convey their meaning. If your content items are large, you can also break them down into manageable chunks to help Microsoft 365 find the relevant information. For each chunk, you can also specify the URL to the original content, so that when users get a citation, they're pointed to the source system. There's, however, one thing that Graph connectors don't let you control.</p>
<p>When you ingest content into Microsoft 365 using Graph connectors, you don't get controls to define the relevance of the content. Microsoft 365 uses its predefined relevance algorithms to determine the relevance of the content. If you have specific requirements for relevance, the default settings might not give you the results you're looking for.</p>
<h3>Azure AI Search</h3>
<p>Often, we turn to building custom engine agents too soon, forgetting that we can use Azure AI Search to bring additional knowledge to Microsoft 365 Copilot agents. Azure AI Search is a managed search service that allows you to build search-based solutions. Similarly to Graph connectors, Azure AI Search allows you to have full control over indexing your content. But unlike other solutions, it also allows you to define custom ranking models. This means that you can define which information is the most relevant for your organization. As a bonus, you can decide whether you want to use lexical, semantic, vector, or hybrid search. This gives you the flexibility to choose the best search strategy for your organization and scenario.</p>
<p>Azure AI Search is a service on Microsoft Azure, and there is a cost involved with using it. Still, if your solution requires controlling context indexing and relevance, it's a great solution that you should consider before building a custom engine agent.</p>
<h2>Integrating declarative agents with Azure AI Search</h2>
<p>Integrating declarative agents with Azure AI Search is no different than <a href="https://learn.microsoft.com/training/paths/copilot-microsoft-365-declarative-agents-api-plugins-visual-studio-code/">integrating an API plugin</a>. For the declarative agent, Azure AI Search is just another API. By default, Azure AI Search is secured with an API key. To integrate it with a declarative agent, register the API key in the Teams Developer Portal and include the registration ID in the agent's manifest. Then, you include the Azure AI Search's API spec in the project. Here's an example:</p>
<pre><code class="language-yaml">openapi: 3.0.4
info:
  title: Knowledge base search
  description: This API allows you to search for relevant knowledge in Azure AI Search indexes.
  version: v1.0
servers:
  - url: https://${{AI_SEARCH_INSTANCE}}.search.windows.net
security:
  - apiKeyAuth: []
paths:
  /indexes/{index}/docs/search:
    post:
      x-openai-isConsequential: false
      description: Search knowledge in Azure AI Search indexes.
      operationId: searchKnowledge
      security:
        - apiKeyAuth: []
      parameters:
        - name: api-version
          in: query
          required: true
          schema:
            type: string
            default: &quot;2024-11-01-preview&quot;
        - name: index
          in: path
          required: true
          schema:
            type: string
            default: &quot;${{AI_SEARCH_INDEX}}&quot;
      requestBody:
        content:
          application/json:
            schema:
              type: object
              properties:
                search:
                  type: string
                count:
                  type: boolean
                  default: true
                queryType:
                  type: string
                  default: semantic
                semanticConfiguration:
                  type: string
                  default: knowledge
                captions:
                  type: string
                  default: extractive
                answers:
                  type: string
                  default: extractive|count-3
                queryLanguage:
                  type: string
              required:
                - search
                - count
                - queryType
                - semanticConfiguration
                - captions
                - answers
      responses:
        '200':
          description: OK
          content:
            application/json:
              schema:
                type: string
components:
  securitySchemes:
    apiKeyAuth:
      type: apiKey
      in: header
      name: api-key
      description: API key for Azure AI Search
</code></pre>
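<p>Before wiring the spec into the agent, you can verify the index and semantic configuration by calling the same endpoint directly. Here's a quick sketch that mirrors the spec above; replace the service name, index, and API key with your own values:</p>
<pre><code class="language-python">import requests

SERVICE = 'your-search-service'   # maps to ${{AI_SEARCH_INSTANCE}} in the spec
INDEX = 'knowledge'               # maps to ${{AI_SEARCH_INDEX}} in the spec
API_KEY = 'your-query-api-key'

response = requests.post(
    f'https://{SERVICE}.search.windows.net/indexes/{INDEX}/docs/search',
    params={'api-version': '2024-11-01-preview'},
    headers={'api-key': API_KEY},
    json={
        'search': 'how do I request a new laptop?',
        'count': True,
        'queryType': 'semantic',
        'semanticConfiguration': 'knowledge',
        'captions': 'extractive',
        'answers': 'extractive|count-3'
    }
)
for doc in response.json().get('value', []):
    print(doc.get('title'), doc.get('url'))
</code></pre>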
<p>Finally, describe the search API operation in an API plugin for the agent:</p>
<pre><code class="language-json">{
&quot;$schema&quot;: &quot;https://developer.microsoft.com/json-schemas/copilot/plugin/v2.2/schema.json&quot;,
&quot;schema_version&quot;: &quot;v2.2&quot;,
&quot;name_for_human&quot;: &quot;Knowledge base&quot;,
&quot;description_for_human&quot;: &quot;This plugin allows you to search knowledge in Azure AI Search indexes.&quot;,
&quot;description_for_model&quot;: &quot;This plugin allows you to search knowledge in Azure AI Search indexes.&quot;,
&quot;namespace&quot;: &quot;m365knowledge&quot;,
&quot;functions&quot;: [
{
&quot;name&quot;: &quot;searchKnowledge&quot;,
&quot;description&quot;: &quot;Search knowledge in Azure AI Search indexes.&quot;,
&quot;capabilities&quot;: {
&quot;response_semantics&quot;: {
&quot;data_path&quot;: &quot;$.value&quot;,
&quot;properties&quot;: {
&quot;title&quot;: &quot;title&quot;,
&quot;url&quot;: &quot;url&quot;
},
&quot;static_template&quot;: {
&quot;file&quot;: &quot;adaptiveCards/searchKnowledge.json&quot;
}
}
}
}
],
&quot;runtimes&quot;: [
{
&quot;type&quot;: &quot;OpenApi&quot;,
&quot;auth&quot;: {
&quot;type&quot;: &quot;ApiKeyPluginVault&quot;,
&quot;reference_id&quot;: &quot;${{APIKEYAUTH_REGISTRATION_ID}}&quot;
},
&quot;spec&quot;: {
&quot;url&quot;: &quot;apiSpecificationFile/openapi.yaml&quot;
},
&quot;run_for_functions&quot;: [
&quot;searchKnowledge&quot;
]
}
]
}
</code></pre>
<h2>Summary</h2>
<p>Bringing knowledge to Microsoft 365 Copilot declarative agents using Azure AI Search is a great way to strike the balance between having more control over indexing and relevance, without the complexity of standing up a custom engine agent with a language model deployment. It allows you to define custom ranking models and choose the best search strategy for your organization. If you're looking for a solution that gives you more control over indexing and relevance, consider using Azure AI Search before building a custom engine agent.</p>
</description><pubDate>Mon, 12 May 2025 14:21:08 GMT</pubDate></item><item><title>Calculate the number of language model tokens for a string</title><link>https://blog.mastykarz.nl/calculate-number-language-model-tokens-string/</link><guid isPermaLink="true">https://blog.mastykarz.nl/calculate-number-language-model-tokens-string/</guid><description><p><img src="https://blog.mastykarz.nl/assets/images/2025/01/banner.png" alt="Calculate the number of language model tokens for a string" class="webfeedsFeaturedVisual" /></p><p>Here's an easy way to calculate the number of language model tokens for a string.</p>
<p>When working with language models, you might need to know how many tokens are in a string. Often, you use this information to estimate the cost for running your application. You might also need to know the number of tokens to tell if your text fits into the context window of the language model you're working with, or if you need to chunk it first.</p>
<p>You can roughly estimate the number of tokens by dividing the number of characters in your string by 4. This is however a <em>very rough estimate</em>. In reality, the actual number of tokens strongly depends on the language model you're using. So for your calculations to be correct, you want to base them on the actual language model that you're using.</p>
<p>To help you calculate the number of tokens in a string, I put together a <a href="https://github.com/waldekmastykarz/python-tokenize">Jupyter Notebook</a>. It allows you to calculate the number of tokens for a string, a file, or all files in a folder.</p>
<p>After you clone the repo, start by restoring all dependencies. You can do this using <a href="https://docs.astral.sh/uv/">uv</a>, by running:</p>
<pre><code class="language-sh">uv sync
</code></pre>
<p>Next, open the notebook. Start with specifying the string/file/folder for which you want to calculate the number of tokens. Then, choose the language model you're using in your application. You can either use a Hugging Face or OpenAI model. For Hugging Face models, specify the name in the <code>user/model_name</code> format. For OpenAI models, simply pass the name of the model. Here's the <a href="https://github.com/openai/tiktoken/blob/63527649963def8c759b0f91f2eb69a40934e468/tiktoken/model.py#L22-L72">list of supported models</a>.</p>
<p><picture>
<source srcset="https://blog.mastykarz.nl/assets/images/2025/01/notebook-configuration.webp" type="image/webp">
<img src="https://blog.mastykarz.nl/assets/images/2025/01/notebook-configuration.png" alt="Screenshot of a cell in a Jupyter notebook showing the configuration options">
</picture></p>
<p>Finally, run all cells to see the number of tokens for your string/file/folder. Scroll down to the respective cell to see the output.</p>
<p><picture>
<source srcset="https://blog.mastykarz.nl/assets/images/2025/01/number-tokens-text.webp" type="image/webp">
<img src="https://blog.mastykarz.nl/assets/images/2025/01/number-tokens-text.png" alt="Output of a cell in a Jupyter notebook showing the number of tokens for a string">
</picture></p>
<p>What's great about calculating the number of tokens in a string using a Jupyter Notebook is that it runs on your machine. It doesn't depend on an external service, doesn't incur any costs, and lets you calculate the number of tokens for your strings securely for virtually any language model.</p>
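<p>If you just want the counting logic without the notebook, it boils down to a few lines. Here's a minimal sketch using <code>tiktoken</code> for OpenAI models and <code>transformers</code> for Hugging Face models; the model names are just examples:</p>
<pre><code class="language-python">import tiktoken
from transformers import AutoTokenizer

text = 'How many tokens does this sentence use?'

# OpenAI model: tiktoken maps the model name to its encoding
encoding = tiktoken.encoding_for_model('gpt-4')
print('OpenAI tokens:', len(encoding.encode(text)))

# Hugging Face model: specify the name in the user/model_name format
tokenizer = AutoTokenizer.from_pretrained('microsoft/Phi-3.5-mini-instruct')
print('Hugging Face tokens:', len(tokenizer.encode(text)))
</code></pre>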
<p>Try it and let me know what you think!</p>
</description><pubDate>Mon, 20 Jan 2025 12:07:40 GMT</pubDate></item><item><title>Create npm package with CommonJS and ESM support in TypeScript</title><link>https://blog.mastykarz.nl/create-npm-package-commonjs-esm-typescript/</link><guid isPermaLink="true">https://blog.mastykarz.nl/create-npm-package-commonjs-esm-typescript/</guid><description><p><img src="https://blog.mastykarz.nl/assets/images/2024/06/npm-cjs-esm.png" alt="Create npm package with CommonJS and ESM support in TypeScript" class="webfeedsFeaturedVisual" /></p><p>If you want to create an npm package and ensure it can be used by everyone, you'll want it to support CommonJS (CJS) and ECMAScript Modules (ESM). Here's how to build such a package using TypeScript.</p>
<h2>CommonJS and ESM</h2>
<p>When building JavaScript apps, you have two module systems to choose from: CommonJS and ECMAScript Modules. Despite the recent rise of ESM, CommonJS is still widely used, not to mention the default in Node.js. To make sure your npm package can be used by everyone, you'll want to support both module systems.</p>
<h2>Project setup</h2>
<p>CommonJS and ESM are not compatible with each other. To support both, your package needs to contain two versions of the code, one for CommonJS and one for ESM. To support usage in TypeScript, you'll also need to include type definitions. Since the shape of the exported API is the same, you can use the same TypeScript code for both CommonJS and ESM.</p>
<p>Here's the overall structure of the output that your package should produce:</p>
<pre><code class="language-plaintext">project
├── dist // package output
│ ├── cjs
│ │ └── CJS code
│ ├── esm
│ │ └── ESM code
│ └── types
│ └── TypeScript type definitions
└── src // source code
└── TypeScript source
</code></pre>
<p>The following sections describe how to set up the project to produce such output.</p>
<blockquote>
<p>Note: Check out the full source code of the <a href="https://github.com/waldekmastykarz/node-ts-cjs-esm">sample project</a> on GitHub.</p>
</blockquote>
<h3>TypeScript configuration</h3>
<p>TypeScript can produce only one output at a time. To produce both CommonJS and ESM code, you need two different TypeScript configuration files (<code>tsconfig.json</code>). The good news is that you can reuse common settings and only specify the parts that differ between CJS and ESM.</p>
<p>Start by creating a base TypeScript configuration file with shared settings (<code>tsconfig.base.json</code>):</p>
<pre><code class="language-json">{
&quot;compilerOptions&quot;: {
&quot;lib&quot;: [
&quot;esnext&quot;
],
&quot;declaration&quot;: true,
&quot;declarationDir&quot;: &quot;./dist/types&quot;,
&quot;strict&quot;: true,
&quot;esModuleInterop&quot;: true,
&quot;skipLibCheck&quot;: true,
&quot;forceConsistentCasingInFileNames&quot;: true,
&quot;moduleResolution&quot;: &quot;node&quot;,
&quot;baseUrl&quot;: &quot;.&quot;,
&quot;rootDir&quot;: &quot;./src&quot;
},
&quot;include&quot;: [
&quot;src&quot;
],
&quot;exclude&quot;: [
&quot;dist&quot;,
&quot;node_modules&quot;
]
}
</code></pre>
<p>The file specifies, among other things, the location of the source code, folders to exclude, and the output directory for type definitions.</p>
<p>Next, create two separate configuration files for CommonJS and ESM.</p>
<p><code>tsconfig.cjs.json</code>:</p>
<pre><code class="language-json">{
&quot;extends&quot;: &quot;./tsconfig.base.json&quot;,
&quot;compilerOptions&quot;: {
&quot;module&quot;: &quot;commonjs&quot;,
&quot;outDir&quot;: &quot;./dist/cjs&quot;,
&quot;target&quot;: &quot;ES2015&quot;
}
}
</code></pre>
<p><code>tsconfig.esm.json</code>:</p>
<pre><code class="language-json">{
&quot;extends&quot;: &quot;./tsconfig.base.json&quot;,
&quot;compilerOptions&quot;: {
&quot;module&quot;: &quot;esnext&quot;,
&quot;outDir&quot;: &quot;./dist/esm&quot;,
&quot;target&quot;: &quot;esnext&quot;
}
}
</code></pre>
<p>Notice how both files extend the base configuration and only specify the module system and their distinct output directories.</p>
<h3>package.json configuration</h3>
<p>After setting up TypeScript, continue with the <code>package.json</code> configuration. Here, you need to do a few things. You need to specify entry points for CJS and ESM consumers and define scripts to build the package in both formats.</p>
<h4>Define entry points</h4>
<p>Start by defining entry points for CJS and ESM consumers.</p>
<pre><code class="language-json">{
&quot;name&quot;: &quot;ts-cjs-esm&quot;,
&quot;version&quot;: &quot;1.0.0&quot;,
&quot;main&quot;: &quot;./dist/cjs/index.js&quot;,
&quot;module&quot;: &quot;./dist/esm/index.mjs&quot;,
&quot;types&quot;: &quot;./dist/types/index.d.ts&quot;,
&quot;files&quot;: [
&quot;dist&quot;
]
// ...
}
</code></pre>
<p>The <code>main</code> entry point refers to the CJS code, and the <code>module</code> entry point to the ESM code. Both formats reuse the same TypeScript type definitions. Unfortunately, the <code>main</code> property overrules the <code>module</code> property in Node.js, so you need an extra way to specify how different consumers should load the library. You can do this using the <code>exports</code> property.</p>
<pre><code class="language-json">{
&quot;name&quot;: &quot;ts-cjs-esm&quot;,
&quot;version&quot;: &quot;1.0.0&quot;,
&quot;main&quot;: &quot;./dist/cjs/index.js&quot;,
&quot;module&quot;: &quot;./dist/esm/index.mjs&quot;,
&quot;types&quot;: &quot;./dist/types/index.d.ts&quot;,
&quot;exports&quot;: {
&quot;.&quot;: {
&quot;require&quot;: &quot;./dist/cjs/index.js&quot;,
&quot;import&quot;: &quot;./dist/esm/index.mjs&quot;
}
},
&quot;files&quot;: [
&quot;dist&quot;
]
// ...
}
</code></pre>
<p>Using the <code>exports</code> property, you can specify how different consumers should load the library. The <code>require</code> property points to the CJS code and the <code>import</code> property to the ESM code.</p>
<h4>.mjs vs .js file extension</h4>
<p>You might've noticed that the ESM code uses the <code>.mjs</code> extension while CJS uses <code>.js</code>. This is necessary to distinguish ESM from CJS code. If your package contained only ESM code, you could set the <code>type</code> property in the <code>package.json</code> file to <code>module</code> to indicate that the package contains ESM code, which instructs Node.js to treat all files as ES modules. Unfortunately, that's not possible here, because the package contains both CJS and ESM code. This means that you need to use the <code>.mjs</code> extension to designate ESM files. One more thing that complicates matters is that TypeScript doesn't allow you to specify the output file extension and always produces <code>.js</code> files. To work around this, you need to rename the output files and update imports after the build.</p>
<h4>Define npm scripts</h4>
<p>With the basic package setup in place, let's continue with defining a few scripts to build the package in both formats.</p>
<pre><code class="language-json">{
&quot;name&quot;: &quot;ts-cjs-esm&quot;,
// ...
&quot;scripts&quot;: {
&quot;build:cjs&quot;: &quot;tsc -p tsconfig.cjs.json&quot;,
&quot;build:esm&quot;: &quot;tsc -p tsconfig.esm.json &amp;&amp; npm run rename:esm&quot;,
&quot;build&quot;: &quot;npm run build:cjs &amp;&amp; npm run build:esm&quot;,
&quot;clean&quot;: &quot;rimraf dist&quot;,
&quot;rename:esm&quot;: &quot;/bin/zsh ./scripts/fix-mjs.sh&quot;,
&quot;prepack&quot;: &quot;npm run clean &amp;&amp; npm run build&quot;
}
// ...
}
</code></pre>
<p>We start with two build scripts, one for CJS and one for ESM, each pointing to the respective TypeScript config file. As mentioned previously, TypeScript doesn't allow you to specify the output file extension, so you need to rename the output files after the build. The <code>rename:esm</code> script runs a shell script that renames the <code>.js</code> files to <code>.mjs</code> and updates import references. For convenience, we also include a script to clean the output directory and build the package before publishing.</p>
<h4>Install dependencies</h4>
<p>In npm scripts, we're using TypeScript and rimraf. Make sure to install them as dev dependencies:</p>
<pre><code class="language-bash">npm install --save-dev typescript rimraf
</code></pre>
<h3>Helper scripts</h3>
<p>To support building ESM code, you need a helper script that renames the output files and updates import references. Create a shell script <code>fix-mjs.sh</code> in the <code>scripts</code> folder:</p>
<pre><code class="language-bash">for file in ./dist/esm/*.js; do
echo &quot;Updating $file contents...&quot;
sed -i '' &quot;s/\.js'/\.mjs'/g&quot; &quot;$file&quot;
echo &quot;Renaming $file to ${file%.js}.mjs...&quot;
mv &quot;$file&quot; &quot;${file%.js}.mjs&quot;
done
</code></pre>
<p>This script iterates over all <code>.js</code> files in the <code>dist/esm</code> folder. For each file, it replaces <code>.js'</code> with <code>.mjs'</code> in the file contents, and then renames the file to have the <code>.mjs</code> extension.</p>
<h3>Git- and npmignore</h3>
<p>If you intend to store your source in a git repo, add a <code>.gitignore</code> file to your project, to avoid including unnecessary files in the repository:</p>
<pre><code class="language-plaintext">dist
node_modules
</code></pre>
<p>Since you're building a library, which you'll likely distribute, you should also add a <code>.npmignore</code> file to exclude unnecessary files from the npm package:</p>
<pre><code class="language-plaintext">scripts
src
</code></pre>
<h3>Sample source</h3>
<p>To test the setup, create simple TypeScript source files in the <code>src</code> folder.</p>
<p><code>myModule.ts</code>:</p>
<pre><code class="language-typescript">export function myFunction() {
return 'Hello World!';
}
</code></pre>
<p><code>index.ts</code>:</p>
<pre><code class="language-typescript">export * from './myModule.js';
</code></pre>
<p>Notice that when referring to the <code>myModule</code> file, you use the <code>.js</code> extension. This is required for the ESM build to work correctly. When building the ESM package, the helper script updates the extension to <code>.mjs</code>.</p>
<h2>Build the package</h2>
<p>To verify that the setup works, run the build script:</p>
<pre><code class="language-bash">npm run build
</code></pre>
<p>If all is well, you should see the output in the <code>dist</code> folder:</p>
<pre><code class="language-plaintext">dist
├── cjs
│ ├── index.js
│ └── myModule.js
├── esm
│ ├── index.mjs
│ └── myModule.mjs
└── types
├── index.d.ts
└── myModule.d.ts
</code></pre>
<h2>Summary</h2>
<p>When building an npm package, consider supporting both CommonJS and ECMAScript Modules so that your package can be used by everyone. To support CJS and ESM, you need to produce two versions of the code, one for each module system. By configuring your project in a specific way, you can produce a package that supports both module systems and includes TypeScript type definitions. This way, you can provide a seamless experience for everyone using your package, regardless of the module system they use.</p>
</description><pubDate>Sat, 29 Jun 2024 14:44:13 GMT</pubDate></item><item><title>Trace the location of API requests</title><link>https://blog.mastykarz.nl/trace-location-api-requests/</link><guid isPermaLink="true">https://blog.mastykarz.nl/trace-location-api-requests/</guid><description><p><img src="https://blog.mastykarz.nl/assets/images/2024/06/request-headers-src.png" alt="Trace the location of API requests" class="webfeedsFeaturedVisual" /></p><p>These days, it's hard to imagine an app that's not using APIs. APIs give us access to data and insights and allow us to integrate with cloud services easily. But as we use more and more APIs, we struggle with debugging our app. What if a request fails? What if we need to change something about a particular request? How do we know where in our app the request is coming from?</p>
<h2>Tracing API requests</h2>
<p>If you call just a handful of API endpoints, and they're specific to particular portions of your app, it's easy to find where a request is coming from. But as your app grows and you have more and more API requests, it becomes harder to trace the location of each request. It becomes even harder when you use SDKs or libraries and can't search for a specific API URL in your code. What if you could automatically trace the location of each API request in your app? What if each request contained information about where in your code it was called from?</p>
<h2>Tracing API requests in .NET</h2>
<p>To trace the location of API requests in .NET, you need a custom delegating handler. This handler automatically adds information about the location in your code from which each API request was made.</p>
<pre><code class="language-csharp">using System.Diagnostics;
public class CodeLocationDelegatingHandler(HttpMessageHandler innerHandler) : DelegatingHandler(innerHandler)
{
protected override Task&lt;HttpResponseMessage&gt; SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
{
// using Ben.Demystifier to get clearer information
var enhTrace = EnhancedStackTrace.Current();
// skip 3 frames which include this delegating handler and preparing
// the request
var frame = (EnhancedStackFrame)enhTrace.GetFrame(3);
var method = frame.MethodInfo.Name;
// or using System.Diagnostics.StackTrace, but it's less clear
// var frame = new StackFrame(0, true);
// var method = frame.GetMethod()?.Name ?? string.Empty;
var fileName = frame.GetFileName();
var lineNumber = frame.GetFileLineNumber();
request.Headers.Add(&quot;x-src-method&quot;, method);
request.Headers.Add(&quot;x-src&quot;, $&quot;{fileName}:{lineNumber}&quot;);
return base.SendAsync(request, cancellationToken);
}
}
</code></pre>
<p>Then, you can use this handler in your HttpClient:</p>
<pre><code class="language-csharp">var handler = new CodeLocationDelegatingHandler(new HttpClientHandler());
using var httpClient = new HttpClient(handler);
async Task Method1()
{
var response = await httpClient.GetStringAsync(&quot;https://jsonplaceholder.typicode.com/posts&quot;);
Console.WriteLine(response);
}
await Method1();
// The API request contains the following headers:
// x-src-method: &lt;&lt;Main&gt;$&gt;g__Method1|0
// x-src: C:\MyApp\Program.cs:6
</code></pre>
<p>If you looked closely at the code, you might have noticed that we're using the <a href="https://github.com/benaadams/Ben.Demystifier">Ben.Demystifier</a> package. This package gives us clearer information about the stack trace. We need it because with async methods, .NET generates a lot of compiler-generated code, which makes it hard to tell where the request was called from. Also, keep in mind that mapping a request to a specific line in your code depends on the debug symbols (PDB) being available at runtime.</p>
<h2>Tracing API requests in JavaScript</h2>
<p>In JavaScript, to get information about the location of an API request, you use the stack trace. You can get the stack trace by creating an Error object and reading its <code>stack</code> property. Once you have the location in your code, add it to the request headers. How you do that depends on the library you use to make API requests. For example, if you're using fetch, you can wrap it to add the location information to the request headers:</p>
<pre><code class="language-javascript">function getCallerLocation(error) {
const stack = error.stack.split('\n');
// skip the error message and the current function
const caller = stack[2].trim();
const regex = /at (\w+) \(([^)]+)\)/;
const match = regex.exec(caller);
const functionName = match[1];
const filePathLocation = match[2];
return { functionName, filePathLocation };
}
async function codeLocationFetch(url, options = {}) {
const { functionName, filePathLocation } = getCallerLocation(new Error());
const defaultHeaders = {
'x-src-method': functionName,
'x-src': filePathLocation
};
const mergedOptions = {
...options,
headers: {
...defaultHeaders,
...options.headers,
},
};
return fetch(url, mergedOptions);
}
async function func1() {
const res = await codeLocationFetch('https://jsonplaceholder.typicode.com/posts');
const json = await res.json();
console.log(json);
}
await func1();
// The API request contains the following headers:
// x-src-method: func1
// x-src: file:///path/to/myapp/index.js:38:15
</code></pre>
<h2>Privacy</h2>
<p>If you don't want to send information about your code base to third parties, you can log the source of each request in your own logging infrastructure and strip that information from the request before it's sent to the API. This way, you can still trace the location of the request in your logs without sharing it with the API.</p>
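<p>As an illustration, here's a minimal sketch of that idea applied to the fetch wrapper above. It reuses the <code>getCallerLocation</code> helper from the earlier snippet; <code>logRequestSource</code> is a hypothetical stand-in for your own logging infrastructure, not an existing API. The caller location goes to your logs, and the <code>x-src-method</code> and <code>x-src</code> headers are never sent to the third party.</p>
<pre><code class="language-javascript">// hypothetical logger; replace with your own logging infrastructure
function logRequestSource(url, functionName, filePathLocation) {
  console.log(`[api-request] ${url} called from ${functionName} (${filePathLocation})`);
}

// like codeLocationFetch, but keeps the source location out of the request
async function privateCodeLocationFetch(url, options = {}) {
  // getCallerLocation is the helper defined in the earlier snippet
  const { functionName, filePathLocation } = getCallerLocation(new Error());
  // log the source of the request locally...
  logRequestSource(url, functionName, filePathLocation);
  // ...and send the request without the x-src-method and x-src headers
  return fetch(url, options);
}
</code></pre>
<p>Alternatively, you could keep adding the headers in your app and have a logging proxy at the edge of your infrastructure record and strip them before requests leave your network.</p>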
<h2>Summary</h2>
<p>Including information about where in your code each API request is made helps you trace the origin of that request. This is invaluable when you're debugging your app and need to locate a specific API call in your codebase. By adding the location information to requests automatically, you minimize the impact on your development process and ensure that the information is applied consistently across the whole app.</p>
</description><pubDate>Mon, 24 Jun 2024 11:02:34 GMT</pubDate></item><item><title>Add background color to menu bar icon in macOS</title><link>https://blog.mastykarz.nl/add-background-color-menu-bar-icon-macos/</link><guid isPermaLink="true">https://blog.mastykarz.nl/add-background-color-menu-bar-icon-macos/</guid><description><p><img src="https://blog.mastykarz.nl/assets/images/2024/02/proxystat-proxy-on.png" alt="Add background color to menu bar icon in macOS" class="webfeedsFeaturedVisual" /></p><p>When building apps for macOS, you might want to add a background color to your icon in the menu bar. This is a great way to make your app stand out and make it easier for users to find your app in the menu bar.</p>
<p>To add a background color to your icon in the menu bar, the trick is to find the right view to apply the background color to. Here's how you can do it in Swift.</p>
<p>Start by getting a reference to the status item. Then, get the status item's button. Configure the image to fit the button.</p>
<pre><code class="language-swift">// get a reference to the status item
let statusItem = NSStatusBar.system.statusItem(withLength: NSStatusItem.squareLength)
// get the status item's button
let button = statusItem.button
// configure the image to fit the button
button?.imageScaling = NSImageScaling.scaleProportionallyUpOrDown
</code></pre>
<p>Next, get the view behind the button and apply the background color to it. Set the image on the button. To align the button with standard macOS status items, such as the microphone, camera, or screen recording indicators, configure the corners to be rounded.</p>
<pre><code class="language-swift">// this is the view you need to get
let buttonView = button?.superview?.window?.contentView
// set the image on the button
button.image = NSImage(named:NSImage.Name(&quot;ProxyEnabledWhite&quot;))
// set the background color
buttonView.layer?.backgroundColor = NSColor.systemOrange.cgColor
// round the corners
buttonView.layer?.cornerRadius = 4
</code></pre>
<p>If you apply the background color to the wrong view, it will overlay the image. Also, configure the <code>template-rendering-intent</code> of the image you show on the button as <code>original</code>. This ensures that the image isn't affected by the background color and keeps enough contrast.</p>
<p>Here's the result:</p>
<p><picture>
<source srcset="https://blog.mastykarz.nl/assets/images/2024/02/proxystat-proxy-on.webp" type="image/webp">
<img src="https://blog.mastykarz.nl/assets/images/2024/02/proxystat-proxy-on.png" alt="ProxyStat showing an orange icon when system proxy is configured">
</picture></p>
<p>To clear the background color, change the <code>backgroundColor</code> to <code>NSColor.clear</code>. Switch to an image configured as a template image so that it automatically adjusts to the system appearance. If you need to, you can also display the image as disabled.</p>
<pre><code class="language-swift">buttonView.layer?.backgroundColor = NSColor.clear.cgColor
button.image = NSImage(named:NSImage.Name(&quot;ProxyEnabledBlack&quot;))
button.appearsDisabled = true
</code></pre>
<p>Here's the result:</p>
<p><picture>
<source srcset="https://blog.mastykarz.nl/assets/images/2024/02/proxystat-proxy-off.webp" type="image/webp">
<img src="https://blog.mastykarz.nl/assets/images/2024/02/proxystat-proxy-off.png" alt="ProxyStat showing a greyed out icon when system proxy is not configured">
</picture></p>
<h2>Summary</h2>
<p>Adding a background color to your icon in the menu bar on macOS is a great way to make your app stand out and make it easier for users to find your app in the menu bar. By applying the background color to the right view, you can combine it with your icon and align it with standard macOS buttons such as microphone, camera or screen recording.</p>
</description><pubDate>Sun, 04 Feb 2024 11:33:54 GMT</pubDate></item><item><title>Easily see if you have system proxy configured on macOS</title><link>https://blog.mastykarz.nl/easily-see-system-proxy-configured-macos/</link><guid isPermaLink="true">https://blog.mastykarz.nl/easily-see-system-proxy-configured-macos/</guid><description><p><img src="https://blog.mastykarz.nl/assets/images/2024/02/proxystat-proxy-on.png" alt="Easily see if you have system proxy configured on macOS" class="webfeedsFeaturedVisual" /></p><p>ProxyStat is a macOS utility that shows when you have a system proxy configured. When you enable a system proxy, the ProxyStat icon in the menu bar turns orange.</p>
<p><picture>
<source srcset="https://blog.mastykarz.nl/assets/images/2024/02/proxystat-proxy-on.webp" type="image/webp">
<img src="https://blog.mastykarz.nl/assets/images/2024/02/proxystat-proxy-on.png" alt="ProxyStat showing an orange icon when system proxy is configured">
</picture></p>
<p>When you don't have a system proxy configured, the icon is greyed out.</p>
<p><picture>
<source srcset="https://blog.mastykarz.nl/assets/images/2024/02/proxystat-proxy-off.webp" type="image/webp">
<img src="https://blog.mastykarz.nl/assets/images/2024/02/proxystat-proxy-off.png" alt="ProxyStat showing a greyed out icon when system proxy is not configured">
</picture></p>
<p>When using tools like <a href="https://aka.ms/devproxy">Dev Proxy</a>, Charles, or Proxyman, it's easy to forget to turn them off in your system proxy configuration. It can be frustrating trying to figure out why your internet isn't working and you can't open any website. ProxyStat is a great way to quickly see if you have a system proxy configured.</p>
<p><a href="https://mastykarzblog.blob.core.windows.net/downloads/ProxyStat.pkg">Download ProxyStat</a> for macOS 14.2 and configure it to automatically launch on startup.</p>
</description><pubDate>Sun, 04 Feb 2024 11:11:05 GMT</pubDate></item></channel></rss>