The 37th annual International Conference on Machine Learning (ICML) was held virtually from July 12–18, 2020. Over seven days, more than 10,800 attendees from 75 countries joined workshops, tutorials, and virtual sessions.
Jayeeta and Anchal, members of the Indellient Data Science team, attended the conference and hosted workshops and sessions. Indellient is a software and consulting firm working at the heart of today's Fortune 100 companies. Using cutting-edge technology and best practices, we drive client businesses forward and provide efficient solutions to complex challenges. In today's blog, they provide a breakdown of the topics they presented and their learning experiences.
Anchal Gupta: Machine Learning Engineer
I am Anchal Gupta, a Machine Learning Engineer at Indellient. I am passionate about solving real-world problems with Artificial Intelligence in ways that benefit society. I have also published research in Reinforcement Learning, Natural Language Processing, and Big Data at international conferences.
I love collaborating with people at the intersection of science and technology to make the world more comfortable for everybody. I received my Master of Science in Data Analytics from Pennsylvania State University and my Bachelor of Engineering in Computer Science from Chitkara University, India.
It was a dream come true to share my research at one of the most prestigious conferences in Machine Learning, and a great experience to interact with so many scholarly women researching AI. I presented part of my Master's thesis (Pennsylvania State University, under the supervision of Dr. Youakim Badr, Dr. Robin Qiu, and Dr. Ashkan Negahban) as a poster at the Women in Machine Learning workshop. The research proposes a methodology for controlling heating systems in smart buildings using Reinforcement Learning.
What is Reinforcement Learning?
Reinforcement Learning (RL) is a machine learning technique in which an agent learns by interacting with an environment. Through a trial-and-error process, the agent learns from the feedback it receives on its actions and experiences. Unlike supervised learning, RL has no supervisor to point out the correct steps; instead, the agent is rewarded or penalized for every action it takes.
Because RL is a trial-and-error process, it requires a great deal of experience to converge. It is therefore challenging to apply in fields where simulated data is not readily available.
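The trial-and-error loop described above can be sketched with tabular Q-learning on a toy problem. Everything below (the corridor environment, the rewards, the hyperparameters) is an illustrative assumption for explanation, not part of the presented research:

```python
import random

# Toy 5-state corridor: the agent starts at state 0 and must walk
# right to reach the goal at state 4, where it earns a reward of 1.
random.seed(0)
N_STATES = 5
ACTIONS = (-1, +1)    # step left or right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Apply an action; reward is received only on reaching the goal."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

def greedy(state):
    """Pick the best-valued action, breaking ties at random."""
    best = max(Q[(state, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(state, a)] == best])

for _ in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy: explore occasionally, otherwise exploit.
        action = random.choice(ACTIONS) if random.random() < EPSILON else greedy(state)
        nxt, reward, done = step(state, action)
        # Temporal-difference update toward reward + discounted future value.
        target = reward + GAMMA * max(Q[(nxt, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (target - Q[(state, action)])
        state = nxt

# The learned greedy policy should walk right from every non-goal state.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

Note that nobody ever tells the agent "move right": the reward signal alone, propagated backward through the temporal-difference updates, is what shapes the policy.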
Using Reinforcement Learning for Climate Control in Smart Buildings
Rising energy consumption is one of the biggest problems in the energy systems domain. Buildings account for 40% of global energy consumption, and among a building's components (heating systems, cooling systems, electric appliances, lights, etc.), heating systems have the highest energy costs. Even though so much energy is wasted in buildings, users still complain about discomfort due to inappropriate temperature settings.
Traditional heating controllers are inefficient because they cannot adapt to dynamic conditions such as changing user preferences and outside temperature patterns. It is therefore necessary to design energy-efficient controllers that improve occupant thermal comfort while reducing energy consumption.
Our research presents a Deep Reinforcement Learning (DRL)-based heating controller that improves occupant comfort and minimizes smart buildings' energy costs. We performed extensive simulation experiments using real-world outside-temperature data. The results show that the DRL-based smart controller outperforms a traditional thermostat controller, improving occupant comfort by 15–30% and reducing energy costs by 5–12%.
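To give a feel for the comfort-versus-energy trade-off such a controller optimizes, here is a rough sketch of a reward function of that shape. The weights, units, and values are my own illustrative assumptions, not the exact formulation from the research:

```python
# Illustrative reward for a heating controller: penalize deviation from
# the occupant's preferred temperature and penalize energy use, with
# weights (assumed here) controlling the comfort/cost trade-off.

def reward(indoor_temp, setpoint, heater_power_kw,
           comfort_weight=1.0, energy_weight=0.2):
    comfort_penalty = abs(indoor_temp - setpoint)   # degrees from preference
    energy_penalty = heater_power_kw                # proxy for energy cost
    return -(comfort_weight * comfort_penalty + energy_weight * energy_penalty)

# A cold room running the heater hard is penalized on both terms:
print(reward(19.0, 21.0, 2.0))
# A room at the setpoint with the heater off is not penalized at all:
print(reward(21.0, 21.0, 0.0))
```

The DRL agent never sees these penalties spelled out as rules; it simply learns, over many simulated episodes, which heating actions maximize the cumulative reward.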
Anchal’s Slides – Presented at ICML – Women in Machine Learning Workshop
Jayeeta Putatunda: Sr Data Scientist
I am Jayeeta Putatunda, Senior Data Scientist at Indellient. I work extensively on NLP projects where I get to explore a lot of state-of-the-art models and build cool products. I am passionate about exploring new concepts in the data science domain and firmly believe that data is the best storyteller.
I am an active speaker at various meetups and conferences and have hosted workshops on a range of Machine Learning and NLP topics. I am also engaged with some amazing organizations that promote and inspire more women to take up STEM. I received my Master of Science in Quantitative Methods and Modeling from the City University of New York, NY, and my Bachelor of Science in Economics and Statistics from West Bengal State University, India.
Attending this year's virtual ICML conference was a great learning experience. Listening to research groups from Google Brain, Amazon Science, and Apple present some of their recent work, and asking them questions firsthand, was very informative.
Trending Topic: Bias in Machine Learning
One of the prominent themes at this year's conference was fairness in Machine Learning. Much of the research presented focused on ways to make Machine Learning and AI more transparent and fair, and handling data bias was also widely discussed. Amazon researchers won a best-paper award for a paper titled "Fair Bayesian Optimization".
Natural Language Processing for Social Good
I also had the great opportunity to share the platform with these giants and present sessions on “NLP for Social Good”.
Consider popular social media channels such as Facebook and Twitter, forums like Quora and Reddit, blogs, and even news platforms. All of these constantly generate vast amounts of unstructured text data, and NLP is the underlying methodology for processing, analyzing, and understanding that text at a scale humans can no longer manage efficiently.
What if we go a step further and apply these technologies for social good? Some organizations are trying to address certain problem areas with NLP. Unfortunately, the relative effect of "goodness" remains minimal without clear guidelines, public policy, and funding.
In the session I discussed why NLP is difficult and what some of its biggest challenges are. We also looked at organizations and NGOs tackling very challenging problems at the intersection of technology and social good, and I went through some of the positive results of the Google AI Impact Challenge, which enabled 20 organizations to contribute positively to the world.
One of the use cases we discussed in depth was suicide prevention. WHO data from 2018–2019 put India's suicide rate at 16.3 per 100,000 people; the US rate is a staggering 15.3 per 100,000. Could the application of NLP assist governments and health care bodies in identifying potential indicators of such behavior and serving targeted help?
Performing Exploratory Data Analysis (EDA) on this myriad of data highlights its value for sentiment analysis and entity recognition. By addressing the synonymy and ambiguity of text data, recent language models have taken a step forward in their ability to:
- extract key information
- identify clusters
- categorize text
These capabilities can form a contextual knowledge graph over the massive unstructured text and could potentially highlight patterns (keywords) in pre-suicidal activities and conversations.
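As a tiny, self-contained illustration of the key-information-extraction step, here is a standard-library-only TF-IDF sketch. The toy documents and stopword list are invented for illustration; a real pipeline would use a proper NLP library and far more careful preprocessing:

```python
import math
import re
from collections import Counter

# Toy corpus standing in for the unstructured text discussed above.
docs = [
    "feeling hopeless and alone lately",
    "great day at the park with friends",
    "no one would miss me if i were gone",
]
STOPWORDS = {"and", "the", "at", "with", "if", "i", "a", "no", "me"}

def tokenize(text):
    """Lowercase, split into words, and drop stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def tfidf(doc_tokens, corpus_tokens):
    """Score each term: frequent in this doc, rare across the corpus."""
    tf = Counter(doc_tokens)
    n_docs = len(corpus_tokens)
    scores = {}
    for term, count in tf.items():
        df = sum(term in toks for toks in corpus_tokens)  # docs containing term
        idf = math.log((1 + n_docs) / (1 + df)) + 1       # smoothed idf
        scores[term] = (count / len(doc_tokens)) * idf
    return scores

corpus = [tokenize(d) for d in docs]
for doc_tokens in corpus:
    scores = tfidf(doc_tokens, corpus)
    top = sorted(scores, key=scores.get, reverse=True)[:3]
    print(top)
```

Even this crude scoring surfaces candidate keywords per document; clustering those keyword vectors or feeding them to a classifier is what turns raw text into the categories and knowledge-graph nodes described above.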
A brainstorming session with the participants brought to light some effective solutions, using current technology, for identifying such nuanced indicators and serving targeted help. The solutions also addressed common concerns including data privacy, ethical usage, and safeguarding user data.
Mentoring Session at ICML’20
I was also chosen by the ICML committee to lead two mentoring sessions. I felt fortunate to connect with grad students and PhD researchers to discuss and address their questions about transitioning into the data science industry:
- What a data scientist's day-to-day work looks like
- What kinds of problems we handle in industry, and the production lifecycle of DS products
- Ways they can build up their work portfolios
A major theme that came across was: with so many things to learn, how does one go about it?
I believe it's very important to create a step-by-step plan and be disciplined in our approach when learning something from scratch. Implementing code is the easiest part; developing the acumen to understand the underlying methodology of various models is crucial. In the overcrowded space of data science, the only way to shine through is to know your basics well. It was also interesting to learn from the participants about their experiences and their vision in this constantly evolving space of Data Science.