🕹️
Demo
Figure 1: Graph of the monthly average PM 2.5 for the year 2023
Figure 2: Showing Map plots for the following question
Nearly 6.7 million lives are lost due to air pollution every year. While policymakers work on mitigation strategies, public awareness can help reduce exposure to air pollution. Air pollution data from government-installed sensors is often publicly available in raw format, but stakeholders face a non-trivial barrier in deriving meaningful insights from that data. In this work, we present VayuBuddy, a Large Language Model (LLM)-powered chatbot system that lowers the barrier between stakeholders and air quality sensor data. VayuBuddy receives questions in natural language, analyses the structured sensor data with LLM-generated Python code, and provides answers in natural language. We use data from Indian government air quality sensors. We benchmark the capabilities of 7 LLMs on 45 diverse question-answer pairs prepared by us. Additionally, VayuBuddy can generate visual analyses from the sensor data, such as line plots, map plots, bar charts and many others, as we demonstrate in this work.
Central Pollution Control Board (CPCB), India. Covers 537 monitoring stations across 279 cities in 31 states. 7 years of PM2.5 data (2017–2023) resampled to daily averages.
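The resampling step mentioned above can be illustrated with a minimal pandas sketch. The hourly input frequency, the station name, and the values are assumptions made for illustration; this is not the project's actual preprocessing code, and only the column names `Timestamp`, `station`, and `PM2.5` are taken from the prompt shown later on this page.

```python
import pandas as pd

# Synthetic hourly readings for one hypothetical station (not real CPCB data).
hourly = pd.DataFrame({
    "Timestamp": pd.date_range("2023-01-01", periods=72, freq=pd.Timedelta(hours=1)),
    "station": "Station A",
    "PM2.5": [float(v) for v in range(72)],
})

# Resample each station's readings to one mean value per calendar day.
daily = (
    hourly.set_index("Timestamp")
    .groupby("station")["PM2.5"]
    .resample("D")
    .mean()
    .reset_index()
)

print(daily)
```

Grouping by station before resampling keeps each station's daily average separate, which matters once more than one station is present in the data.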
Using the above theorem, probability statements about the sample mean can be approximated using a normal distribution. It’s the probability statements that are being approximated, not the random variable itself.
Figure 3: Flowchart of the VayuBuddy Chatbot
* I have a pandas dataframe data of PM2.5 and PM10.
* The columns are 'Timestamp', 'station', 'PM2.5', 'PM10', 'address', 'city', 'latitude', 'longitude', and 'state'.
* Frequency of data is daily.
* `pollution` generally means `PM2.5`.
* You already have df, so don't read the csv file.
* Don't print anything, but save result in a variable `answer` and make it global.
* Unless explicitly mentioned, don't consider the result as a plot.
* PM2.5 guidelines: India: 60, WHO: 15.
* PM10 guidelines: India: 100, WHO: 50.
* If result is a plot, show the India and WHO guidelines in the plot.
* If result is a plot make it in tight layout, save it and save path in `answer`. Example: `answer='plot.png'`. Use uuid to save the plot.
* If result is a plot, rotate x-axis tick labels by 45 degrees.
* If result is not a plot, save it as a string in `answer`. Example: `answer='The city is Mumbai'`.
* I have a geopandas.geodataframe india containing the coordinates required to plot Indian Map with states.
* If the query asks you to plot on India Map, use that geodataframe to plot and then add more points as per the requirements using the similar code as follows: `v = ax.scatter(df['longitude'], df['latitude'])`. If the colorbar is required, use the following code: `plt.colorbar(v)`.
* If the query asks you to plot on India Map, plot the India Map in Beige color.
* Whenever you do any sort of aggregation, report the corresponding standard deviation, standard error and the number of data points for that aggregation.
* Whenever you're reporting a floating point number, round it to 2 decimal places.
* Always report the unit of the data. Example: `The average PM2.5 is 45.67 µg/m³`.
* If a colorbar is plotted and it represents air quality, use `Reds` cmap.
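To make the non-plot rules above concrete, here is a hedged sketch of the kind of code an LLM might produce for a question such as "What is the average PM2.5 in Mumbai?". The small `df` is a synthetic stand-in for the dataframe the prompt says is already loaded, and the helper name `average_pm25` is hypothetical; actual model output will differ.

```python
import pandas as pd

# Synthetic stand-in for the preloaded `df` (values are made up for illustration).
df = pd.DataFrame({
    "Timestamp": pd.to_datetime(
        ["2023-01-01", "2023-01-02", "2023-01-03", "2023-01-04"] * 2
    ),
    "city": ["Mumbai"] * 4 + ["Delhi"] * 4,
    "PM2.5": [40.0, 50.0, 60.0, 70.0, 150.0, 170.0, 190.0, 210.0],
})


def average_pm25(data, city):
    """Build an answer string that follows the aggregation rules in the prompt."""
    values = data.loc[data["city"] == city, "PM2.5"].dropna()
    mean = values.mean()
    std = values.std()   # sample standard deviation (ddof=1)
    sem = values.sem()   # standard error of the mean
    n = len(values)      # number of data points behind the aggregate
    # Floats are rounded to 2 decimal places and the unit is always reported.
    return (
        f"The average PM2.5 in {city} is {mean:.2f} µg/m³ "
        f"(standard deviation {std:.2f} µg/m³, "
        f"standard error {sem:.2f} µg/m³, n = {n})"
    )


# Nothing is printed; the result is stored as a string in the global `answer`.
answer = average_pm25(df, "Mumbai")
```

Reporting the standard deviation, standard error, and sample size next to the mean lets a reader judge how much weight an aggregate deserves, which is presumably why the prompt asks for them.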
Table 1: Overall Performance of LLMs on all evaluation queries.
Model | # Params | Score (out of 45) |
---|---|---|
Llama3.1 | 70B | 39 |
Llama3 | 70B | 38 |
Codestral | 22B | 29 |
Mixtral | 56B | 26 |
Llama3.1 | 8B | 23 |
Llama3 | 8B | 21 |
Gemma | 9B | 19 |
Codestral Mamba | 7B | 19 |
Mistral | 7B | 8 |
Gemma | 7B | 7 |
Table 2: Performance of LLMs on different stakeholders.
Model | Policymakers | AQ Researcher | Lung Patients | Parents | Public |
---|---|---|---|---|---|
Llama3-70b | 19 | 15 | 16 | 20 | 24 |
Mixtral | 15 | 11 | 11 | 15 | 14 |
Gemma-7b | 2 | 2 | 6 | 6 | 4 |
Llama3.1-70b | 20 | 16 | 16 | 20 | 24 |
Codestral Mamba | 8 | 7 | 8 | 9 | 13 |
Codestral | 14 | 12 | 15 | 18 | 19 |
Mistral 7B | 5 | 3 | 3 | 3 | 5 |
Llama3-8b | 11 | 7 | 11 | 14 | 15 |
Llama3.1-8b | 10 | 8 | 12 | 13 | 17 |
Gemma-9b | 8 | 7 | 12 | 13 | 12 |
We observe that in almost all cases the LLMs generated Python code, whether error-free or faulty. Cases in which no code was generated at all were rare.
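The three outcomes described above (error-free code, faulty code, no code) could be distinguished with a small harness like the following. This is a minimal sketch under our own assumptions, not the evaluation code used for the benchmark: the function name `classify_generated_code` and the status labels are hypothetical, and "errorless" here only means the code ran and set `answer`, not that the answer is correct.

```python
def classify_generated_code(code, namespace=None):
    """Run one piece of generated code and return (status, detail)."""
    # Outcome 1: the model returned nothing usable.
    if code is None or not code.strip():
        return "no_code", None

    # Copy the caller's namespace (e.g. one holding `df`) so runs stay isolated.
    env = dict(namespace or {})
    try:
        # Generated code is untrusted; a real deployment should sandbox this call.
        exec(code, env)
    except Exception as exc:  # includes SyntaxError and runtime errors
        return "faulty", f"{type(exc).__name__}: {exc}"

    # The prompt requires the result to be stored in a variable named `answer`.
    if "answer" not in env:
        return "faulty", "code ran but did not set `answer`"
    return "errorless", env["answer"]


print(classify_generated_code("answer = str(1 + 1)"))
print(classify_generated_code("answer = 1 / 0"))
print(classify_generated_code(""))
```

Judging whether an "errorless" run also produced the right answer still requires comparing `answer` against the reference answer for that question.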
Llama3 provides a good balance between code generation and general knowledge. Code-specialized LLMs failed on questions that required prior knowledge of lockdown periods and festival seasons, while the Gemma and Mistral models lack pretraining on code.
For the question “Which state in India currently has the highest PM2.5 levels?”:
Of the 4 models available in the bot, only Llama3 and Gemini Pro returned an answer.
The two models disagreed: Llama3 answered “Delhi”, while Gemini Pro answered “Maharashtra”.
Figure 4: Gemini Pro answer
Figure 5: Llama3 answer
In some cases, the models both generated code for the question and returned an answer.
For the ChatGPT-generated questions above, only two models, Llama3 and Gemini Pro, were able to answer.
Llama3 correctly answered all of the ChatGPT-generated questions.
Only Llama3 was able to generate graphs and map plots for the related questions.
As a testament to its innovative design and real-world relevance, VayuBuddy has been featured in multiple newspaper articles. These articles provide insights into its applications, societal impact, and reception by users.