ONNX Runtime Web

We are introducing ONNX Runtime Web (ORT Web), a new feature in ONNX Runtime that enables JavaScript developers to run and deploy machine learning models in browsers. It also helps enable new classes of on-device computation. ORT Web will replace the soon-to-be-deprecated onnx.js, with improvements such as a more consistent developer experience between packages for server-side and client-side inferencing, and improved inference performance and model coverage. This blog gives you a quick overview of ORT Web, as well as resources for getting started.

A glance at ONNX Runtime (ORT)

ONNX Runtime is a high-performance, cross-platform inference engine for running all kinds of machine learning models. It supports models from the most popular training frameworks, including TensorFlow, PyTorch, scikit-learn, and more. ONNX Runtime aims to provide an easy-to-use experience for AI developers to run models on various hardware and software platforms. Beyond accelerating server-side inference, ONNX Runtime for Mobile has been available since ONNX Runtime 1.5. ORT Web is a new offering in the ONNX Runtime 1.8 release, focusing on in-browser inference.

In-browser inference with ORT Web

Running machine-learning-powered web applications in browsers has drawn a lot of attention from the AI community. It is challenging to make native AI applications portable to multiple platforms given the variations in programming languages and deployment environments. Web applications can easily enable cross-platform portability with the same implementation through the browser. Additionally, running machine learning models in browsers can accelerate performance by reducing server-client communications and simplify the distribution experience without needing any additional libraries and driver installations.

How does it work?

ORT Web accelerates model inference in the browser on both CPUs and GPUs, through WebAssembly (WASM) and WebGL backends respectively. For CPU inference, ORT Web compiles the native ONNX Runtime CPU engine into the WASM backend by using Emscripten. For GPU inference, ORT Web adopts WebGL, a popular standard for accessing GPU capabilities, to achieve high performance.

Figure 1: ORT Web overview.

WebAssembly (WASM) backend for CPU

WebAssembly lets you run code originally written for the server side, such as C++, on the client side in the browser. Before WebAssembly, only JavaScript was available in the browser. WebAssembly has some advantages over JavaScript, such as faster load time and more efficient execution. Furthermore, WebAssembly supports multi-threading through SharedArrayBuffer and Web Workers, and SIMD128 (128-bit Single Instruction Multiple Data) to accelerate bulk data processing. This makes WebAssembly an attractive technique for executing models at near-native speed on the web.

We leverage Emscripten, an open-source compiler toolchain, to compile the ONNX Runtime C++ code into WebAssembly so that it can be loaded in browsers. This allows us to reuse the ONNX Runtime core and native CPU engine. As a result, the ORT Web WASM backend can run any ONNX model and supports most of the functionality that native ONNX Runtime offers, including full ONNX operator coverage, quantized ONNX models, and the mini runtime. We use the multi-threading and SIMD features of WebAssembly to further accelerate model inferencing. Note that SIMD is a new feature and isn’t yet available in all browsers with WebAssembly support. The browsers that support new WebAssembly features are listed on the webassembly.org website.

During initialization, ORT Web checks the capabilities of the runtime environment to detect whether the multi-threading and SIMD features are available. If they are not, ORT Web falls back to a version that matches the environment. Taking MobileNet V2 as an example, CPU inference is 3.4x faster with two threads and SIMD enabled than with plain WebAssembly without these two features.

Figure 2: 3.4x performance acceleration on CPU with multi-threading and SIMD enabled in WebAssembly (Test machine: Processor Intel(R) Xeon(R) CPU E3-1230 v5 @ 3.40GHz, 3401 MHz, 4 Core(s), 8 Logical Processor(s)).
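
The capability check described above can be approximated in a few lines. This is a minimal sketch under stated assumptions, not ORT Web's actual implementation: the names `detectWasmFeatures` and `pickVariant` and the variant labels are hypothetical, the SIMD probe is a tiny hand-assembled module passed to `WebAssembly.validate`, and thread support is approximated by checking that `SharedArrayBuffer` exists.

```javascript
// A tiny module whose only function uses SIMD instructions.
// WebAssembly.validate() returns false on engines without SIMD support.
const SIMD_PROBE = new Uint8Array([
  0, 97, 115, 109, 1, 0, 0, 0,   // "\0asm" magic + version 1
  1, 5, 1, 96, 0, 1, 123,        // type section: () -> v128
  3, 2, 1, 0,                    // function section: one function of type 0
  10, 10, 1, 8, 0,               // code section: one body, no locals
  65, 0, 253, 15, 253, 98, 11,   // i32.const 0; i8x16.splat; i8x16.popcnt; end
]);

function detectWasmFeatures() {
  const simd = typeof WebAssembly === 'object' && WebAssembly.validate(SIMD_PROBE);
  // Threads need shared memory; browsers expose SharedArrayBuffer
  // only on cross-origin isolated pages.
  const threads = typeof SharedArrayBuffer !== 'undefined';
  return { simd, threads };
}

// Hypothetical labels for the build to load; the real file names may differ.
function pickVariant({ simd, threads }) {
  if (simd && threads) return 'simd-threaded';
  if (simd) return 'simd';
  if (threads) return 'threaded';
  return 'baseline';
}

console.log(pickVariant(detectWasmFeatures()));
```

Separating detection from selection keeps the fallback order easy to read and lets the selection logic be exercised without a browser.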

WebGL backend for GPU

WebGL is a JavaScript API that conforms to the OpenGL ES 2.0 standard and is supported by all major browsers on various platforms, including Windows, Linux, macOS, Android, and iOS. The GPU backend of ORT Web is built on WebGL and works in a variety of supported environments. This enables users to seamlessly port their deep learning models across different platforms.

In addition to portability, the ORT WebGL backend improves inference performance through the following optimizations: pack mode, data cache, code cache, and node fusion. Pack mode reduces the memory footprint by up to 75 percent while improving parallelism. The data cache avoids creating the same GPU data multiple times by reusing as much GPU data (textures) as possible. WebGL uses the OpenGL Shading Language (GLSL) to construct shaders that execute GPU programs. However, shaders must be compiled at runtime, which introduces unacceptably high overhead. The code cache addresses this issue by ensuring that each shader is compiled only once. The WebGL backend already performs quite a few typical node fusions, and we plan to take advantage of the graph optimization infrastructure to support a large collection of graph-based optimizations.
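
The code cache idea can be illustrated with a small memoization sketch. This is not ORT Web's implementation: `ShaderCache` is a hypothetical name, and the compile step is a stand-in for the real WebGL calls (`gl.createShader`, `gl.compileShader`, `gl.linkProgram`) so that the example runs without a GPU context.

```javascript
// Memoizes compiled programs by shader source so that each distinct
// shader is compiled only once.
class ShaderCache {
  constructor(compile) {
    this.compile = compile;      // function (source) => compiled program
    this.programs = new Map();   // source string -> compiled program
  }

  get(source) {
    let program = this.programs.get(source);
    if (program === undefined) {
      program = this.compile(source);
      this.programs.set(source, program);
    }
    return program;
  }
}

// Stand-in compile step that only counts how often it is called.
let compileCount = 0;
const cache = new ShaderCache((source) => ({ id: ++compileCount, source }));

const first = cache.get('void main() { /* relu */ }');
const second = cache.get('void main() { /* relu */ }'); // cache hit, no recompilation
console.log(first === second, compileCount); // prints: true 1
```

Keying on the full source string is the simplest choice; a real backend might instead key on the operator type plus input shapes to avoid hashing long strings.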

The WASM backend supports all ONNX operators, whereas the WebGL backend supports only a subset. You can look up the operators supported by each backend. The platforms compatible with each backend in ORT Web are shown below.

Figure 3: Compatible platforms that ORT Web supports.
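
Because the WebGL backend covers only a subset of operators, an application may want to try WebGL first and fall back to WASM. The following is a minimal sketch of that pattern, assuming that session creation rejects when a backend cannot handle the model; `createSessionWithFallback` is a hypothetical helper, not part of the ORT Web API.

```javascript
// `ort` is passed in so the sketch works with either require('onnxruntime-web')
// or the global object created by the script tag.
async function createSessionWithFallback(ort, modelPath, backends = ['webgl', 'wasm']) {
  let lastError;
  for (const backend of backends) {
    try {
      const session = await ort.InferenceSession.create(modelPath, {
        executionProviders: [backend],
      });
      return { session, backend };
    } catch (err) {
      lastError = err; // for example, the model uses an operator this backend lacks
    }
  }
  throw lastError;
}
```

Returning the backend name alongside the session lets the caller log which path was taken, which is useful when comparing performance across devices.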

Get started

In this section, we’ll show you how you can incorporate ORT Web to build machine-learning-powered web applications.

Get an ONNX model

Thanks to the framework interoperability of ONNX, you can convert a model trained in any framework that supports ONNX to the ONNX format. torch.onnx.export is the built-in PyTorch API for exporting models to ONNX, and tensorflow-onnx is a standalone tool for converting TensorFlow and TensorFlow Lite models to ONNX. The ONNX Model Zoo also offers various pre-trained ONNX models covering common scenarios for a quick start.

Inference ONNX model in the browser

There are two ways to use ORT Web: through a script tag or through a bundler. The ORT Web APIs for scoring a model are similar to those of native ONNX Runtime: first create an ONNX Runtime inference session with the model, then run the session with input data. By providing a consistent development experience, we aim to save developers time and effort when integrating ML into applications and services for different platforms through ONNX Runtime.

The following code snippet shows how to call the ORT Web API to run inference on a model with different backends.

const ort = require('onnxruntime-web');

// `await` needs an async context when the module is loaded with require()
async function main() {
  // create an inference session, using WebGL backend. (default is 'wasm')
  const session = await ort.InferenceSession.create('./model.onnx', { executionProviders: ['webgl'] });
  …
  // feed inputs and run
  const results = await session.run(feeds);
}

Figure 4: Code snippet of ORT Web APIs.
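
The snippet in Figure 4 leaves out how `feeds` is built and how results are read. The sketch below shows one way to do it for a model with a single float32 input and a single output. `runSingleInput` is a hypothetical helper, and the assumption of exactly one input and one output is for illustration only; `ort.Tensor`, `session.inputNames`, `session.outputNames`, and `session.run` are part of the ORT Web API.

```javascript
// `ort` is the onnxruntime-web module; `session` comes from ort.InferenceSession.create().
async function runSingleInput(ort, session, data, dims) {
  // Wrap the raw numbers in a tensor with an explicit element type and shape.
  const input = new ort.Tensor('float32', Float32Array.from(data), dims);

  // Feeds map each model input name to a tensor; reading inputNames
  // avoids hard-coding the name used when the model was exported.
  const feeds = { [session.inputNames[0]]: input };

  const results = await session.run(feeds);

  // Results are keyed by output name; `.data` holds the values as a typed array.
  return results[session.outputNames[0]].data;
}
```

A call might look like `await runSingleInput(ort, session, pixels, [1, 3, 224, 224])`, where the shape is whatever the model expects.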

Some advanced features can be configured by setting properties on the `ort.env` object, such as the maximum number of threads and whether SIMD is enabled.

// set maximum thread number for WebAssembly backend. Set to 1 to disable multi-threading
ort.env.wasm.numThreads = 1;

// set flag to enable/disable SIMD (default is true)
ort.env.wasm.simd = false;

Figure 5: Code snippet of properties setting in ORT Web.
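
These flags can also be derived from the environment rather than hard-coded. The following sketch is one possible policy, not a recommendation from the ORT Web team: `configureWasm`, the `maxThreads` cap of 4, and the shape of the options object are all assumptions for illustration. The `env` argument is `ort.env`.

```javascript
// `env` is `ort.env`. In a browser, `hardwareConcurrency` would typically
// come from navigator.hardwareConcurrency.
function configureWasm(env, { threads, simd, hardwareConcurrency = 1, maxThreads = 4 }) {
  env.wasm.simd = Boolean(simd);
  // Use one thread when shared memory is unavailable; otherwise cap the
  // thread count so the page does not claim every core.
  env.wasm.numThreads = threads
    ? Math.max(1, Math.min(hardwareConcurrency, maxThreads))
    : 1;
  return env.wasm;
}
```

Since the WASM backend is set up when the first session is created, it is safest to apply these settings before calling `ort.InferenceSession.create`.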

Pre- and post-processing need to be handled in JavaScript: inputs are prepared before they are fed into ORT Web for inference, and outputs are interpreted afterwards. The ORT Web Demo shows several interesting in-browser vision scenarios powered by image models running on ORT Web. Its source code includes image input processing and inference through ORT Web. The Cloud Advocate curriculum team has also created an end-to-end tutorial on building a Cuisine Recommender Web App with ORT Web. It walks through exporting a scikit-learn model to ONNX and running the model with ORT Web using a script tag.

Figure 6: A cuisine recommender web app with ORT Web.
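
To make the pre- and post-processing step concrete, here is a minimal sketch for an image classifier. It assumes the model expects normalized float32 input in channel-first (CHW) order and returns raw class scores; the helper names and the default mean and standard deviation values are illustrative assumptions and should be replaced with whatever your model was trained with.

```javascript
// Convert RGBA bytes in row-major pixel order (as in canvas ImageData.data)
// to a normalized float32 array in CHW order, dropping the alpha channel.
function rgbaToCHW(rgba, width, height,
                   mean = [0.485, 0.456, 0.406], std = [0.229, 0.224, 0.225]) {
  const plane = width * height;
  const out = new Float32Array(3 * plane);
  for (let i = 0; i < plane; i++) {
    for (let c = 0; c < 3; c++) {
      out[c * plane + i] = (rgba[i * 4 + c] / 255 - mean[c]) / std[c];
    }
  }
  return out;
}

// Turn raw scores into probabilities (subtracting the max for numerical stability).
function softmax(scores) {
  const max = Math.max(...scores);
  const exps = Array.from(scores, (s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Index of the highest value, i.e. the predicted class.
function argmax(values) {
  let best = 0;
  for (let i = 1; i < values.length; i++) {
    if (values[i] > values[best]) best = i;
  }
  return best;
}
```

In a page, the output of `rgbaToCHW` would be wrapped in a tensor of shape `[1, 3, height, width]` before running the session, and `argmax(softmax(scores))` would be mapped to a label.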

Looking forward

We hope this has inspired you to try out ORT Web in your web applications. We would love to hear your suggestions and feedback. You can participate or leave comments in our GitHub repo (ONNX Runtime). We continue to improve performance and model coverage and to add new features. On-device training is another interesting possibility we want to research for ORT Web. Stay tuned for our updates.
