Bash for Beginners

According to the Stack Overflow 2022 Developer Survey, Bash is one of the top 10 most popular technologies. This shouldn’t come as a surprise, given how widely Linux systems with the Bash shell preinstalled are used across tech stacks and the cloud. On Azure, more than 50 percent of virtual machine (VM) cores run on Linux. Read More >>>>

Local-first machine learning

As machine learning usage continues to permeate industries, we see a broadening diversity in deployment targets, with companies choosing to run locally on-client rather than on cloud-based services for security, performance, and cost reasons. On-device machine learning model serving is a difficult task, especially given the limited bandwidth of early-stage startups. This guest post from the team at Read More >>>>

Improve BERT inference

Today’s best-performing language processing models use huge neural architectures with hundreds of millions of parameters. State-of-the-art transformer-based architectures like BERT are available as pretrained models for anyone to use for any language task. The big models have outstanding accuracy, but they are difficult to use in practice. These models are resource hungry due to a Read More >>>>

PyTorch on Azure

Deep learning models are everywhere without us even realizing it. The number of AI use cases has been increasing exponentially with the rapid development of new algorithms, cheaper compute, and greater access to data. Almost every industry has deep learning applications, from healthcare to education to manufacturing, construction, and beyond. Many developers opt to use Read More >>>>

Secure deployments of eBPF programs

The eBPF for Windows runtime has introduced a new mode of operation, native code generation, which exists alongside the currently supported modes of operation for eBPF programs: JIT (just-in-time compilation) and an interpreter, with the administrator able to select the mode when a program is loaded. The native code generation mode involves loading Windows drivers Read More >>>>

What is the ONNX Model Zoo?

Choosing which machine learning model to use, sharing a model with a colleague, and quickly trying out a model are all reasons you may want to run inference on a model quickly. You can configure your environment and download Jupyter notebooks, but it would be nicer if there were a way to Read More >>>>

Optimizing and deploying transformer INT8 inference

Transformer-based models have revolutionized the natural language processing (NLP) domain. Since its inception, the transformer architecture has been integrated into models like Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-trained Transformer (GPT) for tasks such as text generation, summarization, and question answering, to name a few. The newer models are getting Read More >>>>

PyTorch inference

Scale, performance, and efficient deployment of state-of-the-art deep learning models are ubiquitous challenges as applied machine learning grows across the industry. We’re happy to see that the ONNX Runtime machine learning model inferencing solution we’ve built and use in high-volume Microsoft products and services also resonates with our open source community, enabling new capabilities that drive content Read More >>>>


In recent years, large-scale deep learning models have demonstrated impressive capabilities, excelling at tasks across natural language processing, computer vision, and speech domains. Companies now use these models to power novel AI-driven user experiences across a whole spectrum of applications and industries. However, efficiently training large models with tens or hundreds of billions of parameters Read More >>>>

Linux-based eBPF programs

In our previous blog post, we discussed the progress we have made on the eBPF for Windows project. A key goal for us has been to meet developers where they are. As a result, enabling eBPF programs written for Linux to run on top of the eBPF for Windows platform is very important to us. In this update, Read More >>>>


Securing the software supply chain and verifying that chain is hard for any software, and containers running in Kubernetes are no exception. Operational best practices like image signing, scanning, provenance verification, and ensuring these operations have been properly completed with signed software bills of materials (SBOMs) are all required, and tons of tools are appearing in order Read More >>>>

ONNX Runtime Web

We are introducing ONNX Runtime Web (ORT Web), a new feature in ONNX Runtime that enables JavaScript developers to run and deploy machine learning models in browsers. It also helps enable new classes of on-device computation. ORT Web will replace the soon-to-be-deprecated onnx.js, with improvements such as a more consistent developer experience between Read More >>>>