Multimodal Machine Learning: Concepts, Challenges and Future Directions
Whenever we as human beings, want to achieve a goal, we need first make observations, collect data (sensing), build a mental model and then reason depending on the goal, and plans. Usually, to make perfect decisions, we do not just relay on one source of information. We, naturally, use our different senses to understand and analyze certain situations. The advances of technology including sensors, Internet of Things, social networks, and cloud data storage make it possible to generate and collect multimodal data. The traditional unimodal machine learning becomes insufficient and inefficient to process and understand multimodal data. Therefore, there is an urgent need to build and develop models that can process, relate, and understand information from multiple modalities.We provide an overview of multimodal machine learning. In particular, we present the principles, techniques, applications, challenges, and the future opportunities of multimodal machine learning.