Top Twitter Topics by Data Scientists #14

Trending this week: The importance of interpretable Machine Learning; Why ML struggles with causality; The ultimate guide to acing coding interviews for data scientists!

Every week we analyze the most discussed topics on Twitter by Data Science & AI influencers.

The following topics, URLs, resources, and tweets have been automatically extracted using a topic modeling technique based on Sentence BERT, which we have enhanced to fit our use case.

Want to know more about the methodology used? Jump into this article for more details, and find the code in this GitHub repository!


This week, Data Science and AI influencers on Twitter have talked about:

  • Emerging Machine Learning Challenges
  • KDnuggets Updates
  • Cybersecurity 

The following sections provide all the details for each topic.

Emerging Machine Learning Challenges

This week, the influencers talked about trending topics related to machine learning and its usage. In particular, they focused on how machine learning can serve people and businesses, how machine learning models work, and how to use them safely.

Andreas Staub shared a Master’s thesis titled “Towards Usable Machine Learning” that addresses ML usability challenges in non-technical, high-stakes domains through a case study in child welfare screening. It focuses on four key ML usability challenges and homes in on one promising ML augmentation tool to address them.

He also shared an article titled “The Importance of Human Interpretable Machine Learning”, the first in a series of four on this topic. The post explains interpretable ML in depth: the motivation behind it, why it matters, which criteria to consider when interpreting a model, and which scope of interpretation to choose (global or local).

Finally, Ipfconline shared a post about Federated Learning, a decentralized form of Machine Learning. It explains how Federated Learning helps tackle privacy issues by training algorithms on devices distributed across a network, without the data ever leaving each device. The article also details the five steps of Federated Learning, along with its benefits and challenges, and gives some well-known examples of use cases that already rely on it.
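To make the idea concrete, here is a minimal sketch of one common way to do this, weighted averaging of locally trained parameters (often called FedAvg). This is an illustrative assumption, not the procedure described in the shared article: the one-parameter linear model, the toy datasets, and the function names `local_update`, `federated_round`, and `train` are all hypothetical.

```python
from typing import List, Tuple

# (x, y) pairs held privately by one device; never sent to the server.
Dataset = List[Tuple[float, float]]


def local_update(w: float, data: Dataset, lr: float = 0.01, epochs: int = 5) -> float:
    """Run a few gradient-descent steps for y ~ w * x on one device's own data."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global: float, clients: List[Dataset]) -> float:
    """One round: every client trains locally, the server averages the weights.

    Only the updated parameter and the sample count leave each device.
    """
    total = sum(len(data) for data in clients)
    updates = [(local_update(w_global, data), len(data)) for data in clients]
    return sum(w * n for w, n in updates) / total


def train(clients: List[Dataset], rounds: int = 50, w0: float = 0.0) -> float:
    """Repeat federated rounds starting from an initial global parameter."""
    w = w0
    for _ in range(rounds):
        w = federated_round(w, clients)
    return w


# Made-up toy data: three devices, each observing y = 2 * x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0)],
]
w_final = train(clients)
print(f"learned weight: {w_final:.4f}")
```

The server in this sketch never sees a single `(x, y)` pair, only each device's parameter and sample count. Real systems typically add steps such as client sampling, secure aggregation, and compression, which are left out here.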

KDnuggets Updates

KDnuggets shared some fresh articles this week, on topics ranging from machine learning approaches to coding tips.

First, they shared a post that explains why machine learning struggles with causality. In particular, it discusses the current limits of machine learning approaches that are responsible for the lack of causal representations in machine learning models. The post then presents some directions researchers are exploring to add causality to machine learning, one of which is combining machine learning mechanisms with structural causal models. Researchers say that combining causal graphs with machine learning will enable AI agents to create modules that can be applied to different tasks without much training.

On the same topic of causality, they also shared an article that they consider a must-know for Data Scientists and Data Analysts: Causal Design Patterns. This post deals with observational causal inference and focuses on its use in the retail industry. To illustrate potential applications, it gives a brief overview of different causal inference methods, such as Stratification, Propensity Score Weighting, Regression Discontinuity, and Difference in Differences, with motivating examples from consumer retail.
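As a rough illustration of one of these methods, the sketch below computes a basic Difference in Differences estimate: the change in the treated group minus the change in the control group. It assumes the two groups would have followed parallel trends without the intervention. The store sales figures and the function name `diff_in_diff` are made up for this example and do not come from the shared article.

```python
from statistics import mean
from typing import Sequence


def diff_in_diff(
    treated_pre: Sequence[float],
    treated_post: Sequence[float],
    control_pre: Sequence[float],
    control_post: Sequence[float],
) -> float:
    """Estimate the effect as (treated change) - (control change)."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical weekly sales for stores that ran a promotion (treated)
# and comparable stores that did not (control).
treated_pre = [100.0, 104.0, 96.0]
treated_post = [120.0, 118.0, 122.0]
control_pre = [90.0, 92.0, 88.0]
control_post = [95.0, 97.0, 93.0]

effect = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(f"estimated promotion effect: {effect:.1f}")
```

Subtracting the control group's change removes whatever moved sales in both groups, such as seasonality, so the remainder can be attributed to the promotion, but only if the parallel-trends assumption holds.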

Lastly, they shared a post that they call The Ultimate Guide to Acing Coding Interviews for Data Scientists, which “covers understanding the 4 types of coding interview questions and preparing for them effectively.” The post explains why coding questions are asked in data scientist interviews and for which roles in particular, and it describes the four categories of coding interviews and their content in detail.


Cybersecurity

This week, many data science influencers shared various updates from the cybersecurity domain.

Tamara McCleary shared an article on how the International Cyber Convention would be the future of Cybersecurity. She also shared two interesting reads, one on the U.S. Cybersecurity Warning System and one on U.S. Cybersecurity State Power Struggles.

Andreas Staub shared with his followers the 141 Cybersecurity Predictions published by Forbes. The article discusses the role that emerging technologies (AI, machine learning, 5G, quantum computing) and evolving technologies (IoT; mobile, including autonomous vehicles; cloud) will play in improving the efficiency and effectiveness, breadth and depth, of cyberattacks. Though slightly dated, the article remains on point.

Ipfconline told followers that Cybersecurity needs an API-first Revolution, and also shared an article explaining what DevSecOps is and why it matters.

Finally, Ronald van Loon shared an article on the 4 Key Considerations for Consistent IoT Manageability And Security.