What is ONNX Runtime

ONNX Runtime, powered by Intel® Deep Learning Boost: Vector Neural Network Instructions (Intel® DL Boost: VNNI), greatly improves the performance of machine learning model execution for developers. In the past, machine learning models mostly relied on 32-bit floating-point instructions using AVX512. Now, models can use 8-bit integer instructions (Intel® DL Boost: VNNI) to achieve substantial speed increases without significant loss of accuracy. To fully understand these improvements, you must first understand ONNX Runtime, Bidirectional Encoder Representations from Transformers (BERT), Intel® DL Boost: VNNI, and the steps to achieve the best performance with ONNX Runtime on Intel platforms. Keep reading to learn more about accelerating BERT model inference with ONNX Runtime and Intel® DL Boost: VNNI.

What is ONNX Runtime?

ONNX Runtime is an open-source project designed to accelerate machine learning across a wide range of frameworks, operating systems, and hardware platforms. It accelerates machine learning inferencing across all of your deployment targets using a single set of APIs.[1] Intel has partnered with the Microsoft ONNX Runtime team to add support for Intel® DL Boost and to take advantage of microarchitectural improvements, such as non-exclusive caches on the new 11th Gen Intel® Core™ processors, to significantly improve performance. Read on to learn how to achieve the best performance using Intel® DL Boost: VNNI on ONNX Runtime's default CPU backend, Microsoft Linear Algebra Subroutine (MLAS).

Figure 1: ONNX Runtime Architecture

What is BERT?

BERT was originally created and published in 2018 by Jacob Devlin and his colleagues at Google. It is a machine learning technique that greatly improves natural language processing (NLP) capabilities. Instead of processing words individually, as earlier approaches did, it processes complete sentences, so a model can capture the relationships between the words in a sentence and the context of the sentence as a whole. This approach to NLP has revolutionized language processing tasks such as search, document classification, question answering, sentence similarity, text prediction, and more. BERT-class models are widely applied in the industry. Recently, techniques such as knowledge distillation and quantization have been successfully applied to BERT, making this model deployable on Windows PCs.
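To make the idea of quantization concrete, here is a minimal sketch of mapping 32-bit floating-point values to 8-bit integers and back. It assumes a simple symmetric scheme with one shared scale; the function names are hypothetical, and the quantization actually applied to BERT in this workflow may differ.

```python
import numpy as np


def quantize_symmetric_int8(values):
    """Map float32 values to int8 using one shared scale (illustrative scheme)."""
    max_abs = float(np.max(np.abs(values)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = np.clip(np.round(values / scale), -127, 127).astype(np.int8)
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float32 values from the int8 representation."""
    return quantized.astype(np.float32) * scale


weights = np.array([0.5, -1.0, 0.25, 0.75], dtype=np.float32)
q, scale = quantize_symmetric_int8(weights)
restored = dequantize(q, scale)

# The round-trip error is bounded by about half of one quantization step.
max_error = float(np.max(np.abs(weights - restored)))
print(q, scale, max_error)
```

Each value is stored in one byte instead of four, and the rounding error stays within roughly half a quantization step, which is why accuracy can remain close to that of the floating-point model.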

What is Deep Learning Boost: VNNI?

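Intel® Deep Learning Boost: VNNI is designed to deliver significant deep learning acceleration, as well as power-saving optimizations. A single vector instruction (such as VPDPBUSD) multiplies 8-bit integers and accumulates the results into a 32-bit output: within each 32-bit lane, it multiplies four unsigned 8-bit values by four signed 8-bit values, sums the four products, and adds the sum to the 32-bit accumulator.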
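The arithmetic performed in one 32-bit lane of VPDPBUSD can be emulated in a few lines of NumPy. This is only a sketch of the math, not of how the hardware or MLAS implements it: the function name is hypothetical, and the emulation ignores 32-bit overflow.

```python
import numpy as np


def vpdpbusd_lane(acc, a_u8, b_s8):
    """Emulate one 32-bit lane: acc + sum of four (unsigned 8-bit * signed 8-bit) products."""
    a = np.asarray(a_u8, dtype=np.uint8).astype(np.int32)
    b = np.asarray(b_s8, dtype=np.int8).astype(np.int32)
    if a.shape != (4,) or b.shape != (4,):
        raise ValueError("each lane takes exactly four 8-bit values per operand")
    # Widen to 32 bits before multiplying so the products do not overflow 8 bits.
    return int(acc) + int(np.sum(a * b))


# 1*4 + 2*(-5) + 3*6 + 255*(-1) = -243, added to an accumulator of 10.
result = vpdpbusd_lane(10, [1, 2, 3, 255], [4, -5, 6, -1])
print(result)  # -233
```

A 512-bit register holds sixteen such lanes, so one instruction performs 64 multiplications and their accumulation at once.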

Steps to build and execute ONNX Runtime for Windows 10 on 11th Gen Intel® Core™ Processors

Pre-requisites:

Preparing the model:

In the Command Line terminal, open the Jupyter notebook:

jupyter notebook

Once the notebook opens in the browser, run all the cells in the notebook and save the quantized INT8 ONNX model on your local machine.

Build ONNX Runtime:

When building ONNX Runtime, developers can choose between OpenMP and ONNX Runtime's own thread pool implementation. To achieve the best performance on Intel platforms, configure ONNX Runtime with OpenMP and later explicitly define the threading policy for model inference.

In the Command Line terminal:

git clone --recursive https://github.com/Microsoft/ONNXRuntime
cd ONNXRuntime
REM Install cmake-3.13 or higher from https://cmake.org/download/
.\build.bat --config RelWithDebInfo --build_shared_lib --parallel --use_openmp

Tuning Performance for ONNX Runtime’s Default Execution Provider:

Where threading can be controlled explicitly, it is recommended to run the threads in parallel and bind each thread to a separate physical core. On platforms where hyperthreading is enabled, the recommendation is to skip alternate logical cores (if the number of threads needs to be less than the number of logical cores). This reduces the cache thrashing caused by threads being repeatedly swapped between cores.

For Windows, use “start /affinity AA” to keep four threads of ONNX Runtime on physical cores by skipping alternate logical cores. To explicitly fix the number of threads, use the OMP_NUM_THREADS environment variable. For example, in the Command Line terminal:

set KMP_AFFINITY=granularity=fine,compact,1,0
set OMP_NESTED=0
set OMP_WAIT_POLICY=ACTIVE
set /a OMP_NUM_THREADS=4
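The value AA passed to “start /affinity” is a hexadecimal bit mask in which bit n selects logical processor n. The sketch below shows how such a mask can be derived; the helper names are hypothetical, and it assumes a system with eight logical processors where every other one is selected.

```python
def affinity_mask(logical_processors):
    """Build an affinity bit mask: bit n is set when logical processor n is allowed."""
    mask = 0
    for processor in logical_processors:
        mask |= 1 << processor
    return mask


def processors_in_mask(mask):
    """List the logical processors selected by a mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# Skip alternate logical processors on an assumed 8-logical-processor system.
mask = affinity_mask([1, 3, 5, 7])
hex_mask = format(mask, "X")
print(hex_mask)                  # AA
print(processors_in_mask(0xAA))  # [1, 3, 5, 7]
```

Four bits are set, which matches the four threads requested through OMP_NUM_THREADS.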

Run the quantized model with ONNX Runtime:

When executing the runtime, you need to place a folder containing the input test data set you want to use in the same directory as the runtime. For illustration purposes, we will generate a random test input data set with the following Python script, which we will name generate_test_data_set.py:

import numpy as np
from onnx import numpy_helper

batch_sizes = [1, 2, 4, 8, 16]
seq_lengths = [20, 32, 64]

for batch in batch_sizes:
    for seq in seq_lengths:
        # rand() returns floats in [0, 1), so the int64 cast yields all zeros.
        # The tensors still have the shape and dtype needed for timing runs.
        numpy_array = np.random.rand(batch, seq).astype(np.int64)
        tensor = numpy_helper.from_array(numpy_array)
        # Write the same tensor once for each of the three model inputs.
        for index in range(3):
            name = "input_{}_{}_{}.pb".format(index, batch, seq)
            with open(name, "wb") as f:
                f.write(tensor.SerializeToString())
            print(name)

In the Command Line terminal:

python generate_test_data_set.py

This generates test data sets for the three BERT base inputs:

  • input_0_<batch_size>_<seqLength>.pb
  • input_1_<batch_size>_<seqLength>.pb
  • input_2_<batch_size>_<seqLength>.pb

Create a new folder named ‘test_data_set_0’ in the same location as the ONNX model files. Make sure no other folder exists in that location. Copy the three inputs that share the same batch size and sequence length to the test_data_set_0 folder.

In the test_data_set_0 folder, rename

  • input_0_<batch_size>_<seqLength>.pb to input_0.pb
  • input_1_<batch_size>_<seqLength>.pb to input_1.pb
  • input_2_<batch_size>_<seqLength>.pb to input_2.pb
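The copy-and-rename steps above can be scripted. The following is a minimal sketch using only the Python standard library; the function name and its parameters are hypothetical, and it assumes the generated .pb files follow the naming pattern shown earlier.

```python
import shutil
from pathlib import Path


def stage_test_inputs(source_dir, model_dir, batch_size, seq_length):
    """Copy one batch/sequence combination into test_data_set_0 as input_0..2.pb."""
    target = Path(model_dir) / "test_data_set_0"
    target.mkdir(parents=True, exist_ok=True)
    staged = []
    for index in range(3):
        source = Path(source_dir) / "input_{}_{}_{}.pb".format(index, batch_size, seq_length)
        destination = target / "input_{}.pb".format(index)
        # copyfile overwrites files staged for a previous combination.
        shutil.copyfile(source, destination)
        staged.append(destination)
    return staged
```

For example, stage_test_inputs(".", "<model folder>", 1, 20) would stage the batch size 1, sequence length 20 inputs, where "<model folder>" stands for wherever the quantized model was saved.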

Now run ONNX Runtime. In the Command Line terminal:

cd <root>\onnxruntime\build\Windows\RelWithDebInfo\RelWithDebInfo
onnxruntime_perf_test.exe -m times -r <#iterations> -o 99 -e cpu MODEL_NAME.onnx

Repeat these steps for the next combination of batch size and sequence length.
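To keep track of the repetitions, the combinations and the corresponding command line can be enumerated in a short script. This sketch only builds the argument list from the command shown above and does not run it; the function name is hypothetical, and the iteration count of 100 is an assumed value, since the article does not specify one.

```python
from itertools import product

batch_sizes = [1, 2, 4, 8, 16]
seq_lengths = [20, 32, 64]


def perf_test_command(model_path, iterations):
    """Build the onnxruntime_perf_test argument list used in this article."""
    return [
        "onnxruntime_perf_test.exe",
        "-m", "times",
        "-r", str(iterations),
        "-o", "99",
        "-e", "cpu",
        model_path,
    ]


combinations = list(product(batch_sizes, seq_lengths))
command = perf_test_command("MODEL_NAME.onnx", 100)  # 100 iterations is an assumption

for batch, seq in combinations:
    # Stage the inputs for this combination before running the command.
    print(batch, seq, " ".join(command))
```

The list form can be passed to subprocess.run once the inputs for a combination have been placed in test_data_set_0.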

Get extensive details about ONNX Runtime inference.
