Building explainability into the components of machine-learning models
by Adam Zewe for MIT News
Boston MA (SPX) Jul 06, 2022

Researchers develop tools to help data scientists make the features used in machine-learning models more understandable for end users. (MIT stock image)

Explanation methods that help users understand and trust machine-learning models often describe how much certain features used in the model contribute to its prediction. For example, if a model predicts a patient's risk of developing cardiac disease, a physician might want to know how strongly the patient's heart rate data influences that prediction.

But if those features are so complex or convoluted that the user can't understand them, does the explanation method do any good?

MIT researchers are striving to improve the interpretability of features so decision makers will be more comfortable using the outputs of machine-learning models. Drawing on years of field work, they developed a taxonomy to help developers craft features that will be easier for their target audience to understand.

"We found that out in the real world, even though we were using state-of-the-art ways of explaining machine-learning models, there is still a lot of confusion stemming from the features, not from the model itself," says Alexandra Zytek, an electrical engineering and computer science PhD student and lead author of a paper introducing the taxonomy.

To build the taxonomy, the researchers defined properties that make features interpretable for five types of users, from artificial intelligence experts to the people affected by a machine-learning model's prediction. They also offer instructions for how model creators can transform features into formats that will be easier for a layperson to comprehend.

They hope their work will inspire model builders to consider using interpretable features from the beginning of the development process, rather than trying to work backward and focus on explainability after the fact.

MIT co-authors include Dongyu Liu, a postdoc; visiting professor Laure Berti-Equille, research director at IRD; and senior author Kalyan Veeramachaneni, principal research scientist in the Laboratory for Information and Decision Systems (LIDS) and leader of the Data to AI group. They are joined by Ignacio Arnaldo, a principal data scientist at Corelight. The research is published in the June edition of the peer-reviewed ACM SIGKDD Explorations Newsletter.

Real-world lessons
Features are input variables that are fed to machine-learning models; they are usually drawn from the columns in a dataset. Data scientists typically select and handcraft features for the model, focusing mainly on developing features that improve model accuracy rather than on whether a decision maker can understand them, Veeramachaneni explains.
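As a rough illustration (the data and column names below are invented for the example, not drawn from the research), a feature is simply a column the model consumes, and a handcrafted feature is a new column derived from the others:

import pandas as pd

# Hypothetical patient records: each column is a candidate feature.
records = pd.DataFrame({
    "age": [64, 71, 58],
    "resting_heart_rate": [72, 88, 65],
    "cholesterol": [210, 255, 190],
})

# A handcrafted feature, derived to help accuracy rather than
# readability; its meaning is not obvious to a decision maker.
records["chol_per_year_of_age"] = records["cholesterol"] / records["age"]
print(records)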

For several years, he and his team have worked with decision makers to identify machine-learning usability challenges. These domain experts, most of whom lack machine-learning knowledge, often don't trust models because they don't understand the features that influence predictions.

For one project, they partnered with clinicians in a hospital ICU who used machine learning to predict the risk that a patient will face complications after cardiac surgery. Some features were presented as aggregated values, like the trend of a patient's heart rate over time. While features coded this way were "model ready" (the model could process the data), clinicians didn't understand how they were computed. They would rather see how these aggregated features relate to the original values, so they could identify anomalies in a patient's heart rate, Liu says.
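A minimal sketch of that tension, using invented readings rather than the study's data: the "model ready" version of a heart-rate series is a single trend number, while the raw series is what a clinician would want to inspect.

import numpy as np

# Hypothetical hourly heart-rate readings for one patient.
heart_rate = np.array([72, 75, 74, 80, 86, 95, 110])
hours = np.arange(len(heart_rate))

# "Model ready" aggregate: the slope of a least-squares line fit,
# a single number summarizing the trend in beats/min per hour.
trend = np.polyfit(hours, heart_rate, 1)[0]
print(f"trend feature: {trend:.1f} bpm/hour")

# The raw series the aggregate was computed from, where the
# anomaly (the jump to 110) is directly visible.
print("raw readings:", heart_rate.tolist())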

By contrast, a group of learning scientists preferred features that were aggregated. Instead of a feature like "number of posts a student made on discussion forums," they would rather have related features grouped together and labeled with terms they understood, like "participation."
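In code, that grouping might look like the following sketch (the metric names and the simple sum are assumptions made for illustration):

import pandas as pd

# Hypothetical per-student forum metrics.
activity = pd.DataFrame({
    "forum_posts": [12, 3, 27],
    "forum_replies": [8, 1, 19],
    "threads_started": [2, 0, 5],
})

# Roll the related low-level features up under a label the
# learning scientists already use: "participation".
activity["participation"] = activity[
    ["forum_posts", "forum_replies", "threads_started"]
].sum(axis=1)
print(activity["participation"])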

"With interpretability, one size doesn't fit all. When you go from area to area, there are different needs. And interpretability itself has many levels," Veeramachaneni says.

The idea that one size doesn't fit all is key to the researchers' taxonomy. They define properties that can make features more or less interpretable for different decision makers and outline which properties are likely most important to specific users.

For instance, machine-learning developers might focus on having features that are compatible with the model and predictive, meaning they are expected to improve the model's performance.

On the other hand, decision makers with no machine-learning experience might be better served by features that are human-worded, meaning they are described in a way that is natural for users, and understandable, meaning they refer to real-world metrics users can reason about.

"The taxonomy says, if you are making interpretable features, to what level are they interpretable? You may not need all levels, depending on the type of domain experts you are working with," Zytek says.

Putting interpretability first
The researchers also outline feature engineering techniques a developer can employ to make features more interpretable for a specific audience.

Feature engineering is a process in which data scientists transform data into a format machine-learning models can process, using techniques like aggregating data or normalizing values. Most models also can't process categorical data unless it is converted to a numerical code. These transformations are often nearly impossible for laypeople to unpack.
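For instance, here is a hedged sketch of two such transformations in Python (the data is invented; pandas is used for illustration):

import pandas as pd

# Hypothetical raw data with one numeric and one categorical column.
df = pd.DataFrame({
    "age": [2, 35, 70],
    "ward": ["ICU", "general", "ICU"],
})

# Normalize the numeric column to zero mean and unit variance.
df["age_scaled"] = (df["age"] - df["age"].mean()) / df["age"].std()

# One-hot encode the categorical column into 0/1 indicators, the
# kind of numeric recoding that is opaque to a layperson.
df = pd.get_dummies(df, columns=["ward"])
print(df)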

Creating interpretable features might involve undoing some of that encoding, Zytek says. For instance, a common feature engineering technique bins spans of data, like ages, so each span covers the same number of years. To make these features more interpretable, one could instead group age ranges using human terms, like infant, toddler, child, and teen. Or rather than using a transformed feature like average pulse rate, an interpretable feature might simply be the actual pulse rate data, Liu adds.
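A minimal sketch of that kind of human-worded binning, with illustrative bin edges rather than any the paper prescribes:

import pandas as pd

ages = pd.Series([1, 3, 9, 15, 42])

# Replace raw ages (or uniform numeric bins) with categories
# a layperson can reason about. Bin edges here are assumptions.
age_group = pd.cut(
    ages,
    bins=[0, 2, 4, 12, 19, 120],
    labels=["infant", "toddler", "child", "teen", "adult"],
)
print(age_group.tolist())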

"In a lot of domains, the tradeoff between interpretable features and model accuracy is actually very small. When we were working with child welfare screeners, for example, we retrained the model using only features that met our definitions for interpretability, and the performance decrease was almost negligible," Zytek says.

Building on this work, the researchers are developing a system that enables a model developer to handle complicated feature transformations more efficiently, to create human-centered explanations for machine-learning models. This new system will also convert algorithms designed to explain model-ready datasets into formats that decision makers can understand.

Research Report: "The Need for Interpretable Features: Motivation and Taxonomy"


Related Links
Laboratory for Information and Decision Systems












