{"id":482,"date":"2018-08-12T19:35:41","date_gmt":"2018-08-12T17:35:41","guid":{"rendered":"http:\/\/35.180.88.53\/?p=482"},"modified":"2018-08-12T19:49:02","modified_gmt":"2018-08-12T17:49:02","slug":"data-science-in-action-analyzing-air-pollution-co-in-madrid","status":"publish","type":"post","link":"https:\/\/www.sergilehkyi.com\/es\/2018\/08\/data-science-in-action-analyzing-air-pollution-co-in-madrid\/","title":{"rendered":"Data Science in Action: Analyzing Air Pollution (CO) in Madrid"},"content":{"rendered":"\n<p>In recent years, the high levels of pollution during certain dry periods in Madrid have forced the authorities to take measures against the use of cars in the city center, and have been used as a reason to propose drastic modifications to the city&#8217;s urban planning. Thanks to\u00a0<a href=\"https:\/\/datos.madrid.es\/portal\/site\/egob\">Madrid&#8217;s City Council Open Data website<\/a>, the air quality data has been uploaded and is publicly available. There are several data sets, including\u00a0<a href=\"https:\/\/datos.madrid.es\/sites\/v\/index.jsp?vgnextoid=aecb88a7e2b73410VgnVCM2000000c205a0aRCRD&amp;vgnextchannel=374512b9ace9f310VgnVCM100000171f5a0aRCRD\">daily\u00a0<\/a>and\u00a0<a href=\"https:\/\/datos.madrid.es\/sites\/v\/index.jsp?vgnextoid=f3c0f7d512273410VgnVCM2000000c205a0aRCRD&amp;vgnextchannel=374512b9ace9f310VgnVCM100000171f5a0aRCRD\">hourly<\/a>\u00a0historical data of the pollution levels registered from 2001 to 2018 and\u00a0<a href=\"https:\/\/datos.madrid.es\/sites\/v\/index.jsp?vgnextoid=9e42c176313eb410VgnVCM1000000b205a0aRCRD&amp;vgnextchannel=374512b9ace9f310VgnVCM100000171f5a0aRCRD\">the list of stations being used<\/a>\u00a0for the analysis of pollution and other particles in the city.<\/p>\n\n\n\n<p>The dataset is huge, so I decided to focus my analysis on a single pollutant &#8211; carbon monoxide (CO). 
The data was presented hourly, for each of 24 different stations, for each day from 2001 to 2018, although the data on this pollutant for 2002 and 2006-2010 was missing.\u00a0<\/p>\n\n\n\n<p>To get a general picture, I computed the mean value for each day of the year, based on all 24 stations. This process took about an hour on my laptop. Plotting the resulting data already lets us draw some conclusions.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"640\" height=\"480\" src=\"http:\/\/35.180.88.53\/wp-content\/uploads\/2018\/08\/final_plot_CO_mean.png\" alt=\"\" class=\"wp-image-483\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2018\/08\/final_plot_CO_mean.png 640w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2018\/08\/final_plot_CO_mean-300x225.png 300w\" sizes=\"(max-width: 640px) 100vw, 640px\" \/><\/figure>\n\n\n\n<p>As we can see from the above plot, the actions that have been taken had a positive impact on the levels of carbon monoxide in the city. Over the years, the mean level of this pollutant decreased drastically.<\/p>\n\n\n\n<p>As I don&#8217;t usually work with air quality and don&#8217;t understand its mechanics, I was wondering why the plot has this triangular shape. Based on the graph, we can easily say that in some months the level of pollution is higher and in others it is lower. I wanted to identify those periods.<\/p>\n\n\n\n<p>To find out, I decided to find the max and min values for each year and put them into separate tables. I also put those values on a plot. Honestly, the plot didn&#8217;t give me any valuable information, but it helps to explain a little trick. 
So, when I first plotted those values, the graph looked like this:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"640\" height=\"480\" src=\"http:\/\/35.180.88.53\/wp-content\/uploads\/2018\/08\/min_max_not_scaled.png\" alt=\"\" class=\"wp-image-484\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2018\/08\/min_max_not_scaled.png 640w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2018\/08\/min_max_not_scaled-300x225.png 300w\" sizes=\"(max-width: 640px) 100vw, 640px\" \/><\/figure>\n\n\n\n<p>As you can see, because of the huge difference between the max and min values, we cannot see a trend in the min values &#8211; the min line looks almost straight. We can improve this by changing the &#8220;Y&#8221; scale to logarithmic.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">plt.yscale('log')<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"640\" height=\"480\" src=\"http:\/\/35.180.88.53\/wp-content\/uploads\/2018\/08\/min_max_plot.png\" alt=\"\" class=\"wp-image-485\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2018\/08\/min_max_plot.png 640w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2018\/08\/min_max_plot-300x225.png 300w\" sizes=\"(max-width: 640px) 100vw, 640px\" \/><figcaption>Much better, no?<\/figcaption><\/figure>\n\n\n\n<p>So back to our min and max values for each year. Below you will see the tables that explain everything. 
(Months are given as numbers; I hope you understand that 1 is January and 8 is August, although next time I will write a function that translates those numbers to human language :D)<\/p>\n\n\n\n<ul class=\"wp-block-gallery alignwide columns-2 is-cropped wp-block-gallery-1 is-layout-flex wp-block-gallery-is-layout-flex\"><li class=\"blocks-gallery-item\"><figure><img loading=\"lazy\" decoding=\"async\" width=\"497\" height=\"193\" src=\"http:\/\/35.180.88.53\/wp-content\/uploads\/2018\/08\/max_values_table.png\" alt=\"\" data-id=\"486\" data-link=\"http:\/\/35.180.88.53\/?attachment_id=486\" class=\"wp-image-486\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2018\/08\/max_values_table.png 497w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2018\/08\/max_values_table-300x116.png 300w\" sizes=\"(max-width: 497px) 100vw, 497px\" \/><figcaption>MAX volumes<\/figcaption><\/figure><\/li><li class=\"blocks-gallery-item\"><figure><img loading=\"lazy\" decoding=\"async\" width=\"508\" height=\"195\" src=\"http:\/\/35.180.88.53\/wp-content\/uploads\/2018\/08\/min_values_table.png\" alt=\"\" data-id=\"487\" data-link=\"http:\/\/35.180.88.53\/?attachment_id=487\" class=\"wp-image-487\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2018\/08\/min_values_table.png 508w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2018\/08\/min_values_table-300x115.png 300w\" sizes=\"(max-width: 508px) 100vw, 508px\" \/><figcaption>MIN volumes<br\/><\/figcaption><\/figure><\/li><\/ul>\n\n\n\n<p>I was actually surprised that the maximum pollution falls in the winter months and the minimum &#8211; in summer. So I used my <a href=\"http:\/\/35.180.88.53\/the-algorithm-to-solve-almost-any-issue-with-your-computer\/\">problem-solving algorithm<\/a> to find an answer. 
And here is what I found.<\/p>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<p class=\"has-regular-font-size\"><em>Some sources of pollution, like industrial emissions, stay fairly constant throughout the year, no matter what the season. But roaring fireplaces and wood stoves and idling vehicles in the winter all add up to higher levels of particulate matter (the particles that make up smoke) and carbon monoxide (from vehicle emissions).<\/em><\/p>\n\n\n\n<p class=\"has-regular-font-size\"><em>On top of this, cold temperatures and stagnant air have a way of creating a build-up of these substances near the ground, particularly during a weather phenomenon called temperature inversions. In other seasons or weather conditions, warm air sits near the ground and the air can rise easily and carry away pollutants. In a temperature inversion, cold air is trapped near the ground by a layer of warm air. The warm air acts like a lid, holding these substances down. During a temperature inversion, smoke can\u2019t rise and carbon monoxide can reach unhealthy levels. From an air quality perspective, storms are a welcome weather event. Wind, rain and snow storms are sometimes called scrubbers because they help clear out and disperse substances of concern.<\/em><\/p>\n\n\n\n<p>More detailed info can be found here &#8211; <a href=\"https:\/\/airlief.com\/air-pollution-during-winter\/\" target=\"_blank\">&#8220;Why Is Air Pollution Worse During Winter?&#8221;<\/a> and here &#8211; <a href=\"http:\/\/www.fortair.org\/how-cold-weather-affects-air-quality\/\" target=\"_blank\">&#8220;How Cold Weather Affects Air Quality&#8221;<\/a><\/p>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<p>As a person who lives in the city, I thought that air pollution was worse in summer, but the data says the complete opposite and I cannot argue with it &#8211; that&#8217;s why I love Data Science, that&#8217;s why I love what I do, because data never lies. 
Yes, you, as a human, can make an incorrect interpretation or an error in the code that distorts the results, but nonetheless, data. never. lies.<\/p>\n\n\n\n<p>Hopefully this article also opened someone&#8217;s eyes, and if it did, please leave a comment: you are definitely not alone \ud83d\ude42<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In recent years, the high levels of pollution during certain dry periods in Madrid have forced the authorities to&hellip;<\/p>\n","protected":false},"author":1,"featured_media":488,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4],"tags":[],"translation":{"provider":"WPGlobus","version":"3.0.0","language":"es","enabled_languages":["gb","es","uk"],"languages":{"gb":{"title":true,"content":true,"excerpt":false},"es":{"title":false,"content":false,"excerpt":false},"uk":{"title":false,"content":false,"excerpt":false}}},"_links":{"self":[{"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/posts\/482"}],"collection":[{"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/comments?post=482"}],"version-history":[{"count":3,"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/posts\/482\/revisions"}],"predecessor-version":[{"id":491,"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/posts\/482\/revisions\/491"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/media\/488"}],"wp:attachment":[{"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/media?parent=482"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/cat
egories?post=482"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/tags?post=482"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}