Autonomous Flying Networks (FNs), composed of Unmanned Aerial Vehicles (UAVs), are gaining significant traction as a way to provide connectivity in dynamic, infrastructure-limited environments. However, current approaches to FNs have largely overlooked a crucial capability: autonomous perception of users and their service demands. This is the gap the Multi-Agent Perception System (MAPS) aims to fill.
Developed by a team of researchers including Diogo Ferreira, Pedro Ribeiro, André Coelho, and Rui Campos, MAPS is a modular and scalable system designed to interpret visual and audio data collected by UAVs. The system leverages multi-modal large language models (MM-LLMs) and agentic Artificial Intelligence (AI) to generate Service Level Specifications (SLSs). These SLSs provide detailed descriptions of user count, spatial distribution, and traffic demand, which are essential for zero-touch network operation.
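The paper does not publish the SLS schema, but a minimal sketch of what such a specification might contain, based only on the fields named above (user count, spatial distribution, traffic demand), could look like the following. All class and field names here are assumptions for illustration, not the authors' format:

```python
from dataclasses import dataclass, field

@dataclass
class UserCluster:
    """Hypothetical: a group of detected users and their location."""
    user_count: int
    latitude: float
    longitude: float

@dataclass
class ServiceLevelSpec:
    """Hypothetical SLS covering the three fields described in the
    article: user count, spatial distribution, and traffic demand."""
    total_users: int
    clusters: list[UserCluster] = field(default_factory=list)
    demand_mbps: float = 0.0  # aggregate traffic demand (assumed unit)

# Example: two clusters of users detected from UAV video and audio
sls = ServiceLevelSpec(
    total_users=7,
    clusters=[
        UserCluster(user_count=4, latitude=41.178, longitude=-8.596),
        UserCluster(user_count=3, latitude=41.180, longitude=-8.594),
    ],
    demand_mbps=35.0,
)
assert sls.total_users == sum(c.user_count for c in sls.clusters)
```

A machine-readable structure like this is what would let a zero-touch orchestrator place UAVs and allocate capacity without human intervention.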
The effectiveness of MAPS was evaluated on a synthetic multimodal emergency dataset. User detection accuracies exceeded 70%, and SLS generation took under 130 seconds in 90% of cases. The evaluation also showed that combining audio and visual modalities significantly improves user detection over either modality alone, suggesting that MAPS could provide the perception layer needed for autonomous, zero-touch FNs.
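The finding that audio and video together outperform either modality alone can be illustrated with a toy score-level fusion function. This is not the authors' method, and the weights and threshold below are invented for the example:

```python
def fuse_detections(visual_conf: float, audio_conf: float,
                    w_visual: float = 0.6, w_audio: float = 0.4,
                    threshold: float = 0.5) -> bool:
    """Toy score-level fusion: a weighted average of per-modality
    confidences, declaring a user detected when the fused score
    passes a threshold. Weights/threshold are illustrative only."""
    fused = w_visual * visual_conf + w_audio * audio_conf
    return fused >= threshold

# A user who is faint in video (0.4) but clearly audible (0.8) is
# detected only when both modalities are combined:
print(fuse_detections(0.4, 0.8))  # True: 0.6*0.4 + 0.4*0.8 = 0.56
print(fuse_detections(0.4, 0.0))  # False with video alone: 0.24
```

The intuition matches the reported result: each modality covers failure modes of the other (occlusion for video, background noise for audio), so fusing them raises overall detection accuracy.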
By enabling UAVs to autonomously perceive and respond to user demands, MAPS could pave the way for more efficient on-demand connectivity. This is particularly relevant in emergency scenarios or areas with limited infrastructure, where traditional connectivity solutions fall short. The development of MAPS therefore represents a meaningful step toward fully autonomous flying networks.



