AI News, Data Tagging in Medical Imaging – An overview

Today we are witnessing the era of smart AI-driven solutions that empower humans to automate tasks requiring a certain level of cognition.

Just as humans learn through experience over a lifetime, machines learn to automate tasks from the data fed to them.

From our experience developing vertical-agnostic, AI-first products, we are well aware of how important quality data is and, subsequently, how important it is to develop a smart data tagging process.

In this series of blog posts, we’re going to talk about the importance of data tagging in medical imaging, where we are developing computer vision technologies to better assist doctors.

Determining the exact list of tags, levels of tagging and properties of each of these tags is extremely critical and, if done properly, can save you a lot of money and time.

For example, to build a dataset of subtypes of hemorrhage, one needs to make a list of all types of intracranial hemorrhage such as subarachnoid hemorrhage and subdural hemorrhage.

For example, intraparenchymal hemorrhage can be easily tagged by making an outline, whereas a pathology like cerebral atrophy is nearly impossible to tag by making a mere outline.
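
One way to pin down the list of tags, their levels and their properties before tagging starts is to write the schema as a small data structure. The sketch below is only an illustration: the `Tag` and `AnnotationType` names are hypothetical, the two-level hierarchy is an assumption, and the annotation type given to any entry other than intraparenchymal hemorrhage and cerebral atrophy is a placeholder rather than something stated above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class AnnotationType(Enum):
    OUTLINE = "outline"          # region can be traced on the image
    STUDY_LEVEL = "study_level"  # finding is marked for the whole scan


@dataclass
class Tag:
    name: str
    annotation: AnnotationType
    children: list[Tag] = field(default_factory=list)

    def flatten(self, depth: int = 0):
        """Yield (depth, tag) pairs for this tag and all of its descendants."""
        yield depth, self
        for child in self.children:
            yield from child.flatten(depth + 1)


# Hypothetical schema; only pathologies named in the text are listed.
# Annotation types other than the two discussed above are placeholders.
schema = [
    Tag("intracranial hemorrhage", AnnotationType.OUTLINE, children=[
        Tag("subarachnoid hemorrhage", AnnotationType.OUTLINE),
        Tag("subdural hemorrhage", AnnotationType.OUTLINE),
        Tag("intraparenchymal hemorrhage", AnnotationType.OUTLINE),
    ]),
    Tag("cerebral atrophy", AnnotationType.STUDY_LEVEL),
]

for root in schema:
    for depth, tag in root.flatten():
        print("  " * depth + f"{tag.name} [{tag.annotation.value}]")
```

Writing the schema down this way makes it easy to review with clinicians before any annotator time is spent.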

Data flow, quality control and training processes are each broad topics in their own right; we will discuss them in detail in the next few posts.

Develop a quality assurance and quality control plan

Just as data checking and review are important components of data management, so is the step of documenting how these tasks were accomplished.

A helpful approach to documenting data checking and review (often called Quality Assurance, Quality Control, or QA/QC) is to list the actions taken to evaluate the data, how decisions were made regarding problem resolution, and what actions were taken to resolve the problems at each step in the data life cycle.

For instance, a researcher may graph a list of particular observations and look for outliers, return to the original data source to confirm suspicions about certain values, and then make a change to the live dataset.
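
The outlier check and its documentation can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `QCEntry` record and its fields are hypothetical, the measurements are made up, and Tukey's interquartile-range fences stand in for whatever outlier rule a project actually adopts.

```python
import statistics
from dataclasses import dataclass


@dataclass
class QCEntry:
    step: str      # stage of the data life cycle
    check: str     # action taken to evaluate the data
    decision: str  # how the problem was (or will be) resolved


def flag_outliers(values, k=1.5):
    """Return indices of values outside the Tukey fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [i for i, v in enumerate(values) if v < low or v > high]


# Made-up measurements; one value looks like a data-entry slip.
observations = [12.1, 11.8, 12.4, 12.0, 121.0, 11.9, 12.2]
suspects = flag_outliers(observations)

# Record what was checked and what was decided, rather than silently editing.
qc_log = [
    QCEntry(
        step="data entry review",
        check=f"value {observations[i]} at index {i} falls outside the Tukey fences",
        decision="pending: compare against the original source before editing",
    )
    for i in suspects
]

for entry in qc_log:
    print(entry)
```

Keeping the log separate from the dataset means the live data is changed only after the original source has been consulted, and the reason for the change stays on record.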


In the 2000s, as digital formats were becoming the prevalent way of storing data and information, metadata was also used to describe digital data using metadata standards.

The first description of 'meta data' for computer systems is purportedly noted by MIT's Center for International Studies experts David Griffel and Stuart McIntosh in 1967: 'In summary then, we have statements in an object language about subject descriptions of data and token codes for the data.'

For example, a web page may include metadata specifying what software language the page is written in (e.g., HTML), what tools were used to create it, what subjects the page is about, and where to find more information about the subject.

Metadata assists users in resource discovery by 'allowing resources to be found by relevant criteria, identifying resources, bringing similar resources together, distinguishing dissimilar resources, and giving location information.'[9]

In many countries, the metadata relating to emails, telephone calls, web pages, video traffic, IP connections and cell phone locations are routinely stored by government organizations.[11]

For example, a digital image may include metadata that describes how large the picture is, the color depth, the image resolution, when the image was created, the shutter speed, and other data.[13]

For example, by itself, a database containing several numbers, all 13 digits long, could be the results of calculations or a list of numbers to plug into an equation; without any other context, the numbers themselves can be perceived as the data.

But if given the context that this database is a log of a book collection, those 13-digit numbers may now be identified as ISBNs - information that refers to the book, but is not itself the information within the book.
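
Context can sometimes be tested rather than assumed. As a small sketch, the function below applies the ISBN-13 checksum rule (digits weighted alternately by 1 and 3 must sum to a multiple of 10) to tell a plausible ISBN from an arbitrary 13-digit number. The function name and the sample values are illustrative only.

```python
def is_isbn13(number: str) -> bool:
    """Return True if the string is 13 digits that satisfy the ISBN-13 checksum."""
    if len(number) != 13 or not number.isdigit():
        return False
    # Weights alternate 1, 3, 1, 3, ... across all thirteen digits.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(number))
    return total % 10 == 0


records = ["9780306406157", "1234567890123"]
for value in records:
    label = "plausible ISBN-13" if is_isbn13(value) else "just a 13-digit number"
    print(value, "->", label)
```

A passing checksum does not prove the number refers to a book; it only shows that the value is consistent with the metadata interpretation.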

but also what statistical processes were used to create the data, which is of particular importance to the statistical community in order to both validate and improve the process of statistical data production[6].

however, advances in universal design have raised its profile.[21]:213-214 Projects like Cloud4All and GPII identified the lack of common terminologies and models to describe the needs and preferences of users and information that fits those needs as a major gap in providing universal access solutions.[21]:210-211

While the efforts to describe and standardize the varied accessibility needs of information seekers are becoming more robust, their adoption into established metadata schemas has not been as developed.

For example, while Dublin Core (DC)'s “audience” and MARC 21's “reading level” could be used to identify resources suitable for users with dyslexia and DC's “Format” could be used to identify resources available in braille, audio, or large print formats, there is more work to be done.[21]:214

Metadata (metacontent) or, more correctly, the vocabularies used to assemble metadata (metacontent) statements, is typically structured according to a standardized concept using a well-defined metadata scheme, including: metadata standards and metadata models.

Using controlled vocabularies for the components of metacontent statements, whether for indexing or finding, is endorsed by ISO 25964: 'If both the indexer and the searcher are guided to choose the same term for the same concept, then relevant documents will be retrieved.'[24]
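
The idea that indexer and searcher should land on the same term can be sketched with a lookup table. Everything here is hypothetical: the vocabulary entries, the `normalize`, `index_document` and `search` helpers, and the document identifiers are invented for illustration and are not drawn from ISO 25964.

```python
# Hypothetical controlled vocabulary: each variant maps to one preferred term.
PREFERRED_TERM = {
    "subdural hemorrhage": "subdural hemorrhage",
    "subdural hematoma": "subdural hemorrhage",
    "sdh": "subdural hemorrhage",
    "subarachnoid hemorrhage": "subarachnoid hemorrhage",
    "sah": "subarachnoid hemorrhage",
}

index = {}  # preferred term -> set of document ids


def normalize(term: str) -> str:
    """Map a free-text term to its preferred form; unknown terms are rejected."""
    key = " ".join(term.lower().split())
    if key not in PREFERRED_TERM:
        raise KeyError(f"'{term}' is not in the controlled vocabulary")
    return PREFERRED_TERM[key]


def index_document(doc_id: str, term: str) -> None:
    """Indexer side: file the document under the preferred term."""
    index.setdefault(normalize(term), set()).add(doc_id)


def search(term: str) -> set:
    """Searcher side: look up documents under the same preferred term."""
    return index.get(normalize(term), set())


index_document("scan-001", "Subdural Hematoma")
index_document("scan-002", "SDH")
print(sorted(search("subdural hemorrhage")))
```

Rejecting unknown terms instead of storing them as-is is a design choice: it forces new vocabulary to be reviewed before it fragments the index.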

In all cases where the metadata schemata exceed the planar depiction, some type of hypermapping is required to enable display and view of metadata according to chosen aspect and to serve special views.

In ISO/IEC 11179 Part-3, the information objects are data about Data Elements, Value Domains, and other reusable semantic and representational information objects that describe the meaning and technical details of a data item.

ISO/IEC 11179 Part 3 also has provisions for describing compound structures that are derivations of other data elements, for example through calculations, collections of one or more data elements, or other forms of derived data.

While this standard describes itself originally as a 'data element' registry, its purpose is to support describing and registering metadata content independently of any particular application, lending the descriptions to being discovered and reused by humans or computers in developing new applications, databases, or for analysis of data collected in accordance with the registered metadata content.

Although not a standard, Microformat (also mentioned in the section metadata on the internet below) is a web-based approach to semantic markup which seeks to re-use existing HTML/XHTML tags to convey metadata.

Metadata may be written into a digital photo file that will identify who owns it, copyright and contact information, what brand or model of camera created the file, along with exposure information (shutter speed, f-stop, etc.) and descriptive information, such as keywords about the photo, making the file or image searchable on a computer and/or the Internet.

Information on the times, origins and destinations of phone calls, electronic messages, instant messages and other modes of telecommunication, as opposed to message content, is another form of metadata.

Bulk collection of this call detail record metadata by intelligence agencies has proven controversial after disclosures by Edward Snowden that certain intelligence agencies, such as the NSA, had been (and perhaps still are) keeping online metadata on millions of internet users for up to a year, regardless of whether or not they were ever persons of interest to the agency.

Metadata is particularly useful in video, where information about its contents (such as transcripts of conversations and text descriptions of its scenes) is not directly understandable by a computer, but where efficient search of the content is desirable.

Video metadata is derived from two sources: (1) operationally gathered metadata, that is, information about the content produced, such as the type of equipment, software, date, and location;

(2) human-authored metadata, which improves search engine visibility, discoverability and audience engagement, and provides advertising opportunities to video publishers.[38]

Until the 1980s, many library catalogues used 3x5 inch cards in file drawers to display a book's title, author, subject matter, and an abbreviated alpha-numeric string (call number) which indicated the physical location of the book within the library's shelves.

More recent and specialized instances of library metadata include the establishment of digital libraries including e-print repositories and digital image libraries.

While often based on library principles, the focus on non-librarian use, especially in providing metadata, means they do not follow traditional or common cataloging approaches.

Metadata in a museum context is the information that trained cultural documentation specialists, such as archivists, librarians, museum registrars and curators, create to index, structure, describe, identify, or otherwise specify works of art, architecture, cultural objects and their images.[49][50][51]

Many museums and cultural heritage centers recognize that given the diversity of art works and cultural objects, no single model or standard suffices to describe and catalogue cultural works.[49][50][51]

The early stages of standardization in archiving, description and cataloging within the museum community began in the late 1990s with the development of standards such as Categories for the Description of Works of Art (CDWA), Spectrum, CIDOC Conceptual Reference Model (CRM), Cataloging Cultural Objects (CCO) and the CDWA Lite XML schema.[50]

Scholars and professionals in the field note that the 'quickly evolving landscape of standards and technologies' creates challenges for cultural documentarians, specifically non-technically trained professionals.[52]

Relational databases and metadata work to document and describe the complex relationships amongst cultural objects and multi-faceted works of art, as well as between objects and places, people and artistic movements.[50][51]

An object's materiality, function and purpose, as well as the size (e.g., measurements, such as height, width, weight), storage requirements (e.g., climate-controlled environment) and focus of the museum and collection, influence the descriptive depth of the data attributed to the object by cultural documentarians.[51]

The established institutional cataloging practices, goals and expertise of cultural documentarians and database structure also influence the information ascribed to cultural objects, and the ways in which cultural objects are categorized.[49][51]

In the 2000s, as more museums adopted archival standards and created intricate databases, discussions about Linked Data between museum databases came up in the museum, archival and library science communities.[52]

Document metadata have proven particularly important in legal environments in which litigation has requested metadata, which can include sensitive information detrimental to a certain party in court.

This new law means that both security and policing agencies will be allowed to access up to two years of an individual's metadata, with the aim of making it easier to stop any terrorist attacks and serious crimes from happening.

Data warehouses differ from business intelligence (BI) systems, because BI systems are designed to use data to create reports and analyze the information, to provide strategic guidance to management.[59]

The purpose of a data warehouse is to house standardized, structured, consistent, integrated, correct, 'cleaned' and timely data, extracted from various operational systems in an organization.

The design of structural metadata commonality using a data modeling method such as entity relationship model diagramming is important in any data warehouse development effort.

The HTML format used to define web pages allows for the inclusion of a variety of types of metadata, from basic descriptive text, dates and keywords to further advanced metadata schemes such as the Dublin Core, e-GMS, and AGLS.[62]
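
To make the HTML case concrete, here is a minimal sketch that collects `<meta>` name/content pairs with Python's standard `html.parser` module. The sample page, its values and the `MetaCollector` class are made up for illustration; the `DC.` prefix follows the common convention for embedding Dublin Core elements in HTML.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name, content = attributes.get("name"), attributes.get("content")
        if name and content is not None:
            self.meta[name] = content


# Made-up page used only to exercise the parser.
page = """
<html><head>
  <meta charset="utf-8">
  <meta name="description" content="Overview of data tagging in medical imaging">
  <meta name="keywords" content="metadata, tagging, medical imaging">
  <meta name="DC.title" content="Data Tagging in Medical Imaging">
</head><body></body></html>
"""

collector = MetaCollector()
collector.feed(page)
for name, content in collector.meta.items():
    print(f"{name}: {content}")
```

Tags without a `name` attribute, such as the `charset` declaration, are skipped, so only descriptive metadata ends up in the dictionary.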

Microformats allow metadata to be added to on-page data in a way that regular web users do not see, but computers, web crawlers and search engines can readily access.

Many search engines are cautious about using metadata in their ranking algorithms due to exploitation of metadata and the practice of search engine optimization, SEO, to improve rankings.

are not executing care and diligence when creating their own metadata and that metadata is part of a competitive environment where the metadata is used to promote the metadata creator's own purposes.

Metadata that describes geographic objects in electronic storage or format (such as datasets, maps, features, or documents with a geospatial component) has a history dating back to at least 1994 (refer to the MIT Library page on FGDC Metadata).

This typically means which organization or institution collected the data, what type of data, which date(s) the data was collected, the rationale for the data collection, and the methodology used for the data collection.

When first released in 1982, Compact Discs only contained a Table Of Contents (TOC) with the number of tracks on the disc and their length in samples.[2][3] Fourteen years later in 1996, a revision of the CD Red Book standard added CD-Text to carry additional metadata.[4] But CD-Text was not widely adopted.

Metadata can be used to name, describe, catalogue and indicate ownership or copyright for a digital audio file, and its presence makes it much easier to locate a specific audio file within a group, typically through use of a search engine that accesses the metadata.

Unfortunately, the web is littered with unscrupulous websites whose business and traffic models depend on plucking content from other sites and re-using it (sometimes in strangely modified ways) on their own domains.

So, by including links back to your site, and to the specific post you've authored, you can ensure that the search engines see most of the copies linking back to you (indicating that your source is probably the originator).

Many times, you can ignore this problem, but if it gets very severe, and you find the scrapers taking away your rankings and traffic, you might consider using a legal process called a DMCA takedown.

Quality assurance: Importance of systems and standard operating procedures

Quality control is focused on fulfilling quality requirements, and as related to clinical trials, it encompasses the operational techniques and activities undertaken within the quality assurance system to verify that the requirements for quality of the trial-related activities have been fulfilled.[1] Quality assurance, on the other hand, is focused on providing confidence that quality requirements are fulfilled.

As related to clinical trials, it includes all those planned and systematic actions that are established to ensure that the trial is performed and the data are generated, documented (recorded), and reported in compliance with GCP and the applicable regulatory requirements.[1] Quality control is generally the responsibility of the operational units, and quality is infused into the outputs and verified as they are being generated.

The quality assurance department must operate independently from the operational units and it must regularly perform quality review activities (self-inspection audits/internal audits) to ensure compliance within operational units with Company quality standards, good working practices [GxPs: current Good Manufacturing Practice (cGMP), Good Laboratory Practice (GLP), GCP, etc.], and local, national, regional and international legal, ethical and regulatory requirements.

Transport analysis guidance: WebTAG

This overview provides general introductory information on the role of transport modelling and appraisal, and how the transport appraisal process supports the development of investment decisions to support a business case.

It provides more detailed knowledge on the key components of the transport appraisal process – including options development analyses and appraisal – describing how the concepts of transparency and proportionality should be applied.

This guidance gives a more detailed description of the transport appraisal process, including more detail on the 3-stage process of option development, further appraisal and implementation/evaluation.

This guidance provides an overview of the department’s process for updating guidance, when guidance is intended to become definitive and how analysts should go about adopting changes to guidance in a proportionate manner.

These guidance documents give advice on the principles of cost-benefit analysis in transport appraisal, the estimation of scheme costs and the calculation of direct impacts on transport users and providers.

The environmental impacts covered in this manual are noise, air quality, greenhouse gases, landscape, townscape, the historic environment, biodiversity and the water environment.

The guidance discusses the relationship between environmental impact appraisal (as set out in this unit) and environmental impact assessment and the need to tailor the level of appraisal to the stage of development of the proposal.

Guidance is provided for active modes (eg walking and cycling), aviation, rail and highway interventions and on the use of marginal external congestion costs to estimate decongestion benefits resulting from mode switch away from car use.

This guidance informs practitioners of best practice in transport models that provide evidence for use in the appraisal of transport schemes and policies.

It also identifies the main sources of transport data that are available to practitioners developing transport models and addresses the methods used for gathering data, including survey methodology.

These techniques include modelling the impacts of parking policies and park and ride schemes and incorporating the impacts of smarter choice initiatives in existing modelling tools.
