{"id":917,"date":"2019-09-11T17:37:19","date_gmt":"2019-09-11T15:37:19","guid":{"rendered":"http:\/\/35.180.88.53\/?p=917"},"modified":"2019-09-22T15:07:31","modified_gmt":"2019-09-22T13:07:31","slug":"football-why-winners-win-and-losers-loose","status":"publish","type":"post","link":"https:\/\/www.sergilehkyi.com\/es\/2019\/09\/football-why-winners-win-and-losers-loose\/","title":{"rendered":"Football: why winners win and losers lose"},"content":{"rendered":"\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p>Exploring 5 years of European football<\/p><\/blockquote>\n\n\n\n<h2 class=\"wp-block-heading\">Intro<\/h2>\n\n\n\n<p>In this notebook we will explore modern football metrics (xG, xGA and xPTS) and their influence on sports analytics.<\/p>\n\n\n\n<ul><li><strong>Expected Goals (xG)<\/strong>&nbsp;\u2013 measures the quality of a shot based on several variables, such as the type of assist, the shot angle and distance from goal, whether it was a header and whether it was defined as a big chance.<\/li><li><strong>Expected Goals Against (xGA)<\/strong>&nbsp;\u2013 measures the quality of the shots a team concedes, that is, how many goals it would be expected to let in. A related metric, Expected Assists (xA), measures the likelihood that a given pass becomes a goal assist, based on factors including the type of pass, its end point and its length.<\/li><li><strong>Expected Points (xPTS)<\/strong>&nbsp;\u2013 measures the likelihood that a given game brings points to the team.<\/li><\/ul>\n\n\n\n<p>These metrics let us look much deeper into football statistics, understand the performance of players and teams in general, and realize the role of luck and skill in it. 
Disclaimer: both matter.<\/p>\n\n\n\n<p>The data collection process for this notebook is described in this Kaggle kernel:&nbsp;<a href=\"https:\/\/www.kaggle.com\/slehkyi\/web-scraping-football-statistics-2014-now\">Web Scraping Football Statistics<\/a><\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">import pandas as pd\nimport numpy as np\nimport matplotlib.pyplot as plt\nimport seaborn as sns\nimport collections\nimport warnings\n\nfrom IPython.display import display, HTML\n\n<em># import plotly <\/em>\nimport plotly\nimport plotly.figure_factory as ff\nimport plotly.graph_objs as go\nimport plotly.offline as py\nfrom plotly.offline import iplot, init_notebook_mode\nimport plotly.tools as tls\n\n<em># configure things<\/em>\nwarnings.filterwarnings('ignore')\n\npd.options.display.float_format = '{:,.2f}'.format  \npd.options.display.max_columns = 999\n\npy.init_notebook_mode(connected=True)\n\n%load_ext autoreload\n%autoreload 2\n\n%matplotlib inline\nsns.set()\n\n<em># !pip install plotly --upgrade<\/em><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Data import and visual EDA<\/h2>\n\n\n\n<pre class=\"wp-block-preformatted\">df = pd.read_csv('..\/input\/understat.com.csv')\ndf = df.rename(index=int, columns={'Unnamed: 0': 'league', 'Unnamed: 1': 'year'}) \ndf.head()<\/pre>\n\n\n\n<p>In the next visualization we will check how many teams from each league were in the top 4 during the last 5 years. 
It can give us some insight into how stable the top teams are in different countries.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">f = plt.figure(figsize=(25,12))\n<em># one subplot per league, laid out in a 2x3 grid<\/em>\nfor i, league in enumerate(['Bundesliga', 'EPL', 'La_liga', 'Serie_A', 'Ligue_1', 'RFPL'], start=1):\n    ax = f.add_subplot(2,3,i)\n    plt.xticks(rotation=45)\n    sns.barplot(x='team', y='pts', hue='year', data=df[(df['league'] == league) &#038; (df['position'] <= 4)], ax=ax)<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"519\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_1024\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/league_data_1-1024x519.jpg\" alt=\"\" class=\"wp-image-918\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/league_data_1-1024x519.jpg 1024w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/league_data_1-300x152.jpg 300w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/league_data_1-768x389.jpg 768w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/league_data_1-1600x811.jpg 1600w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/league_data_1.jpg 1665w\" sizes=\"(max-width: 
1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>As these bar charts show, some teams made the top 4 only once in the last 5 years, so it is not a common occurrence for them; if we dig deeper, we may find that a luck factor played in their favor. It is just a theory, so let's take a closer look at those outliers.<\/p>\n\n\n\n<p>The teams that were in the top 4 only once during the last 5 seasons are:<\/p>\n\n\n\n<ul><li>Wolfsburg (2014) and Schalke 04 (2017) from the Bundesliga<\/li><li>Leicester (2015) from EPL<\/li><li>Villarreal (2015) and Sevilla (2016) from La Liga<\/li><li>Lazio (2014) and Fiorentina (2014) from Serie A<\/li><li>Lille (2018) and Saint-Etienne (2018) from Ligue 1<\/li><li>FC Rostov (2015) and Dinamo Moscow (2014) from RFPL<\/li><\/ul>\n\n\n\n<p>Let's save these teams.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># Keep only the columns needed for our analysis<\/em>\ndf_xg = df[['league', 'year', 'position', 'team', 'scored', 'xG', 'xG_diff', 'missed', 'xGA', 'xGA_diff', 'pts', 'xpts', 'xpts_diff']]\n\noutlier_teams = ['Wolfsburg', 'Schalke 04', 'Leicester', 'Villarreal', 'Sevilla', 'Lazio', 'Fiorentina', 'Lille', 'Saint-Etienne', 'FC Rostov', 'Dinamo Moscow']<\/pre>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># Checking if getting the first place requires phenomenal execution<\/em>\nfirst_place = df_xg[df_xg['position'] == 1]\n\n<em># Get list of leagues<\/em>\nleagues = df['league'].drop_duplicates()\nleagues = leagues.tolist()\n\n<em># Get list of years<\/em>\nyears = df['year'].drop_duplicates()\nyears = years.tolist()<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Understanding how winners win<\/h2>\n\n\n\n<p>In this section we will try to find some patterns that can help us understand some of the ingredients of the victory soup :D. 
Starting with Bundesliga.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Bundesliga<\/h3>\n\n\n\n<pre class=\"wp-block-preformatted\">first_place[first_place['league'] == 'Bundesliga']<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"795\" height=\"158\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_795\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/bundesliga_1.jpg\" alt=\"\" class=\"wp-image-919\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/bundesliga_1.jpg 795w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/bundesliga_1-300x60.jpg 300w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/bundesliga_1-768x153.jpg 768w\" sizes=\"(max-width: 795px) 100vw, 795px\" \/><\/figure>\n\n\n\n<pre class=\"wp-block-preformatted\">pts = go.Bar(x = years, y = first_place['pts'][first_place['league'] == 'Bundesliga'], name = 'PTS')\nxpts = go.Bar(x = years, y = first_place['xpts'][first_place['league'] == 'Bundesliga'], name = 'Expected PTS')\n\ndata = [pts, xpts]\n\nlayout = go.Layout(\n    barmode='group',\n    title=\"Comparing Actual and Expected Points for Winner Team in Bundesliga\",\n    xaxis={'title': 'Year'},\n    yaxis={'title': \"Points\",\n    }\n)\n\nfig = go.Figure(data=data, layout=layout)\npy.iplot(fig)<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"864\" height=\"513\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_864\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/bundesliga_xg_1.jpg\" alt=\"\" class=\"wp-image-920\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/bundesliga_xg_1.jpg 864w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/bundesliga_xg_1-300x178.jpg 300w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/bundesliga_xg_1-768x456.jpg 768w\" sizes=\"(max-width: 864px) 100vw, 864px\" \/><\/figure>\n\n\n\n<p>Looking at the table and the bar chart, we see that every year Bayern got more points than they should have, scored more than expected and conceded fewer than expected (except in 2018, which did not derail their plan to win the season, but hints that Bayern played worse that year, although their competitors did not take advantage of it).<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># and from this table we see that Bayern dominates here totally, even when they do not play well<\/em>\ndf_xg[(df_xg['position'] <= 2) &#038; (df_xg['league'] == 'Bundesliga')].sort_values(by=['year','xpts'], ascending=False)<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"837\" height=\"274\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_837\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/bundesliga_2.jpg\" alt=\"\" class=\"wp-image-921\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/bundesliga_2.jpg 837w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/bundesliga_2-300x98.jpg 300w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/bundesliga_2-768x251.jpg 768w\" sizes=\"(max-width: 837px) 100vw, 837px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">La Liga<\/h3>\n\n\n\n<pre class=\"wp-block-preformatted\">first_place[first_place['league'] == 'La_liga']<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"766\" height=\"155\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_766\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/laliga_1.jpg\" alt=\"\" class=\"wp-image-922\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/laliga_1.jpg 766w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/laliga_1-300x61.jpg 300w\" sizes=\"(max-width: 766px) 100vw, 766px\" \/><\/figure>\n\n\n\n<pre class=\"wp-block-preformatted\">pts = 
go.Bar(x = years, y = first_place['pts'][first_place['league'] == 'La_liga'], name = 'PTS')\nxpts = go.Bar(x = years, y = first_place['xpts'][first_place['league'] == 'La_liga'], name = 'Expected PTS')\n\ndata = [pts, xpts]\n\nlayout = go.Layout(\n    barmode='group',\n    title=\"Comparing Actual and Expected Points for Winner Team in La Liga\",\n    xaxis={'title': 'Year'},\n    yaxis={'title': \"Points\",\n    }\n)\n\nfig = go.Figure(data=data, layout=layout)\npy.iplot(fig)<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"843\" height=\"489\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_843\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/laliga_xg_1.jpg\" alt=\"\" class=\"wp-image-923\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/laliga_xg_1.jpg 843w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/laliga_xg_1-300x174.jpg 300w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/laliga_xg_1-768x445.jpg 768w\" sizes=\"(max-width: 843px) 100vw, 843px\" \/><\/figure>\n\n\n\n<p>As we can see from the table above, in 2014 and 2015 Barcelona were creating enough chances to win the title without relying on individual skill or luck; from these numbers we can say that THE Team was playing there.<\/p>\n\n\n\n<p>In 2016 there was a lot of competition between Madrid and Barcelona, and in the end Madrid got luckier or showed more guts in one particular game (or Barcelona got unlucky or lost their nerve), and that cost the title. I am sure that if we dig into that season we can find that particular match.<\/p>\n\n\n\n<p>In 2017 and 2018 Barcelona's success was mostly down to Lionel Messi, who was scoring or assisting in situations where ordinary players would not. 
That is what led to such a jump in the xPTS difference. It makes me think (given that Real Madrid are very active on the transfer market this season) that it may end badly. This is just a subjective opinion based on the numbers and on watching Barcelona's games. I really hope I am wrong.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># comparing with runner-up<\/em>\ndf_xg[(df_xg['position'] <= 2) &#038; (df_xg['league'] == 'La_liga')].sort_values(by=['year','xpts'], ascending=False)<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"786\" height=\"270\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_786\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/laliga_2.jpg\" alt=\"\" class=\"wp-image-924\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/laliga_2.jpg 786w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/laliga_2-300x103.jpg 300w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/laliga_2-768x264.jpg 768w\" sizes=\"(max-width: 786px) 100vw, 786px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">EPL<\/h3>\n\n\n\n<pre class=\"wp-block-preformatted\">first_place[first_place['league'] == 'EPL']<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"789\" height=\"157\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_789\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/epl_1.jpg\" alt=\"\" class=\"wp-image-925\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/epl_1.jpg 789w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/epl_1-300x60.jpg 300w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/epl_1-768x153.jpg 768w\" sizes=\"(max-width: 789px) 100vw, 789px\" \/><\/figure>\n\n\n\n<pre class=\"wp-block-preformatted\">pts = go.Bar(x = years, y = first_place['pts'][first_place['league'] == 'EPL'], name 
= 'PTS')\nxpts = go.Bar(x = years, y = first_place['xpts'][first_place['league'] == 'EPL'], name = 'Expected PTS')\n\ndata = [pts, xpts]\n\nlayout = go.Layout(\n    barmode='group',\n    title=\"Comparing Actual and Expected Points for Winner Team in EPL\",\n    xaxis={'title': 'Year'},\n    yaxis={'title': \"Points\",\n    }\n)\n\nfig = go.Figure(data=data, layout=layout)\npy.iplot(fig)<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"831\" height=\"490\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_831\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/epl_xg_1.jpg\" alt=\"\" class=\"wp-image-926\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/epl_xg_1.jpg 831w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/epl_xg_1-300x177.jpg 300w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/epl_xg_1-768x453.jpg 768w\" sizes=\"(max-width: 831px) 100vw, 831px\" \/><\/figure>\n\n\n\n<p>In the EPL we see a clear trend that tells you: \"To win you have to be better than the statistics\". An interesting case here is Leicester's title-winning story in 2015: they got 12 points more than they should have, and at the same time Arsenal got 6 points fewer than expected! This is why we love football: such inexplicable things happen. I am not saying it was pure luck, but it played its part here.<\/p>\n\n\n\n<p>Another interesting thing is Manchester City in 2018: they are super stable! They scored just one goal more than expected, conceded 2 fewer and got 7 extra points, while Liverpool fought really well and had a bit more luck on their side, but could not win despite being 13 points ahead of expectation.<\/p>\n\n\n\n<p>Pep is finishing building the destruction machine. 
Man City create and convert their chances based on skill and do not rely on luck, which makes them very dangerous next season.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># comparing with runner-ups<\/em>\ndf_xg[(df_xg['position'] <= 2) &#038; (df_xg['league'] == 'EPL')].sort_values(by=['year','xpts'], ascending=False)<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"797\" height=\"266\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_797\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/epl_2.jpg\" alt=\"\" class=\"wp-image-927\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/epl_2.jpg 797w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/epl_2-300x100.jpg 300w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/epl_2-768x256.jpg 768w\" sizes=\"(max-width: 797px) 100vw, 797px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">Ligue 1<\/h3>\n\n\n\n<pre class=\"wp-block-preformatted\">first_place[first_place['league'] == 'Ligue_1']<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"806\" height=\"152\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_806\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/ligue1_1.jpg\" alt=\"\" class=\"wp-image-928\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/ligue1_1.jpg 806w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/ligue1_1-300x57.jpg 300w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/ligue1_1-768x145.jpg 768w\" sizes=\"(max-width: 806px) 100vw, 806px\" \/><\/figure>\n\n\n\n<pre class=\"wp-block-preformatted\">pts = go.Bar(x = years, y = first_place['pts'][first_place['league'] == 'Ligue_1'], name = 'PTS')\nxpts = go.Bar(x = years, y = first_place['xpts'][first_place['league'] == 'Ligue_1'], name = 'Expected PTS')\n\ndata = [pts, xpts]\n\nlayout = 
go.Layout(\n    barmode='group',\n    title=\"Comparing Actual and Expected Points for Winner Team in Ligue 1\",\n    xaxis={'title': 'Year'},\n    yaxis={'title': \"Points\",\n    }\n)\n\nfig = go.Figure(data=data, layout=layout)\npy.iplot(fig)<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"829\" height=\"495\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_829\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/ligue1_xg_1.jpg\" alt=\"\" class=\"wp-image-929\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/ligue1_xg_1.jpg 829w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/ligue1_xg_1-300x179.jpg 300w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/ligue1_xg_1-768x459.jpg 768w\" sizes=\"(max-width: 829px) 100vw, 829px\" \/><\/figure>\n\n\n\n<p>In the French Ligue 1 we keep seeing the trend \"to win you have to execute at 110%, because 100% is not enough\". Here Paris Saint Germain dominate totally. Only in 2016 do we have an outlier in the form of Monaco, who scored 30 goals more than expected and got almost 17 points more than expected! Luck? A good share of it. PSG were good that year, but Monaco were extraordinary. 
Again, we cannot claim that it is pure luck or pure skill, but rather a perfect combination of both in the right place at the right time.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># comparing with runner-ups<\/em>\ndf_xg[(df_xg['position'] <= 2) &#038; (df_xg['league'] == 'Ligue_1')].sort_values(by=['year','xpts'], ascending=False)<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"806\" height=\"265\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_806\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/ligue1_2.jpg\" alt=\"\" class=\"wp-image-930\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/ligue1_2.jpg 806w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/ligue1_2-300x99.jpg 300w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/ligue1_2-768x253.jpg 768w\" sizes=\"(max-width: 806px) 100vw, 806px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">Serie A<\/h3>\n\n\n\n<pre class=\"wp-block-preformatted\">first_place[first_place['league'] == 'Serie_A']<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"749\" height=\"150\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_749\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/serie_a_1.jpg\" alt=\"\" class=\"wp-image-931\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/serie_a_1.jpg 749w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/serie_a_1-300x60.jpg 300w\" sizes=\"(max-width: 749px) 100vw, 749px\" \/><\/figure>\n\n\n\n<pre class=\"wp-block-preformatted\">pts = go.Bar(x = years, y = first_place['pts'][first_place['league'] == 'Serie_A'], name = 'PTS')\nxpts = go.Bar(x = years, y = first_place['xpts'][first_place['league'] == 'Serie_A'], name = 'Expected PTS')\n\ndata = [pts, xpts]\n\nlayout = go.Layout(\n    barmode='group',\n    title=\"Comparing Actual and Expected Points 
for Winner Team in Serie A\",\n    xaxis={'title': 'Year'},\n    yaxis={'title': \"Points\",\n    }\n)\n\nfig = go.Figure(data=data, layout=layout)\npy.iplot(fig)<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"832\" height=\"500\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_832\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/serie_a_xg_1.jpg\" alt=\"\" class=\"wp-image-934\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/serie_a_xg_1.jpg 832w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/serie_a_xg_1-300x180.jpg 300w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/serie_a_xg_1-768x462.jpg 768w\" sizes=\"(max-width: 832px) 100vw, 832px\" \/><\/figure>\n\n\n\n<p>In the Italian Serie A, Juventus have dominated for 8 years in a row, although they cannot show any great success in the Champions League. I think that by checking this chart and the numbers we can see that Juve do not have strong enough competition within the country and pick up a lot of \"lucky\" points, which again stems from multiple factors. We can also see that Napoli beat Juventus on xPTS twice, but this is real life: in 2017, for example, Juve went crazy and scored 26 extra goals (or created goals out of nowhere), while Napoli conceded 3 more than expected (because of a goalkeeping error, or perhaps some team's excellence in 1 or 2 particular matches). As with the situation in La Liga when Real Madrid became champions, I am sure we can find 1 or 2 games that were key that year.<\/p>\n\n\n\n<p>Details matter in football. 
You see, one mistake here, one shot off the woodwork there, and you have lost the title.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># comparing to runner-ups<\/em>\ndf_xg[(df_xg['position'] <= 2) &#038; (df_xg['league'] == 'Serie_A')].sort_values(by=['year','xpts'], ascending=False)<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"749\" height=\"270\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_749\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/serie_a_2.jpg\" alt=\"\" class=\"wp-image-935\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/serie_a_2.jpg 749w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/serie_a_2-300x108.jpg 300w\" sizes=\"(max-width: 749px) 100vw, 749px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">RFPL<\/h3>\n\n\n\n<pre class=\"wp-block-preformatted\">first_place[first_place['league'] == 'RFPL']<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"797\" height=\"154\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_797\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/rfpl_1.jpg\" alt=\"\" class=\"wp-image-936\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/rfpl_1.jpg 797w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/rfpl_1-300x58.jpg 300w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/rfpl_1-768x148.jpg 768w\" sizes=\"(max-width: 797px) 100vw, 797px\" \/><\/figure>\n\n\n\n<pre class=\"wp-block-preformatted\">pts = go.Bar(x = years, y = first_place['pts'][first_place['league'] == 'RFPL'], name = 'PTS')\nxpts = go.Bar(x = years, y = first_place['xpts'][first_place['league'] == 'RFPL'], name = 'Expected PTS')\n\ndata = [pts, xpts]\n\nlayout = go.Layout(\n    barmode='group',\n    title=\"Comparing Actual and Expected Points for Winner Team in RFPL\",\n    xaxis={'title': 'Year'},\n    yaxis={'title': 
\"Points\",\n    }\n)\n\nfig = go.Figure(data=data, layout=layout)\npy.iplot(fig)<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"833\" height=\"491\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_833\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/rfpl_xg.jpg\" alt=\"\" class=\"wp-image-937\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/rfpl_xg.jpg 833w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/rfpl_xg-300x177.jpg 300w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/rfpl_xg-768x453.jpg 768w\" sizes=\"(max-width: 833px) 100vw, 833px\" \/><\/figure>\n\n\n\n<p>I do not follow the Russian Premier League, so looking coldly at the data alone we see the same pattern of scoring more than you deserve, and also an interesting situation with CSKA Moscow from 2015 to 2017. During these years the team was good but converted its advantage only once; in the other two seasons, either failing to convert was punished or the main competitor simply converted better.<\/p>\n\n\n\n<p>There is no justice in football :D. Still, I believe that with VAR the numbers will become more stable in the coming seasons. 
Because one of the reasons for those extra goals and points is refereeing mistakes.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># comparing to runner-ups<\/em>\ndf_xg[(df_xg['position'] <= 2) &#038; (df_xg['league'] == 'RFPL')].sort_values(by=['year','xpts'], ascending=False)<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"800\" height=\"270\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_800\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/rfpl_2.jpg\" alt=\"\" class=\"wp-image-938\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/rfpl_2.jpg 800w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/rfpl_2-300x101.jpg 300w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/rfpl_2-768x259.jpg 768w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Statistical overview<\/h2>\n\n\n\n<p>Since there are 6 leagues with different teams and stats, I decided to focus on one at first to try out different approaches and then replicate the final analysis model on the other 5. 
And since I mostly watch La Liga, I will start with this competition, as I know the most about it.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># Create a separate DataFrame for each league; reset_index() returns a new frame and keeps the old index as an 'index' column<\/em>\nlaliga = df_xg[df_xg['league'] == 'La_liga'].reset_index()\nepl = df_xg[df_xg['league'] == 'EPL'].reset_index()\nbundesliga = df_xg[df_xg['league'] == 'Bundesliga'].reset_index()\nseriea = df_xg[df_xg['league'] == 'Serie_A'].reset_index()\nligue1 = df_xg[df_xg['league'] == 'Ligue_1'].reset_index()\nrfpl = df_xg[df_xg['league'] == 'RFPL'].reset_index()<\/pre>\n\n\n\n<pre class=\"wp-block-preformatted\">laliga.describe()<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"743\" height=\"216\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_743\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/describe_1.jpg\" alt=\"\" class=\"wp-image-939\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/describe_1.jpg 743w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/describe_1-300x87.jpg 300w\" sizes=\"(max-width: 743px) 100vw, 743px\" \/><\/figure>\n\n\n\n<pre class=\"wp-block-preformatted\">def print_records_antirecords(df):\n  print('Presenting some records and antirecords: \\n')\n  for col in df.describe().columns:\n    if col not in ['index', 'year', 'position']:\n      team_min = df['team'].loc[df[col] == df.describe().loc['min',col]].values[0]\n      year_min = df['year'].loc[df[col] == df.describe().loc['min',col]].values[0]\n      team_max = df['team'].loc[df[col] == df.describe().loc['max',col]].values[0]\n      year_max = df['year'].loc[df[col] == df.describe().loc['max',col]].values[0]\n      val_min = df.describe().loc['min',col]\n      val_max = 
df.describe().loc['max',col]\n      print('The lowest value of {0} had {1} in {2} and it is equal to {3:.2f}'.format(col.upper(), team_min, year_min, val_min))\n      print('The highest value of {0} had {1} in {2} and it is equal to {3:.2f}'.format(col.upper(), team_max, year_max, val_max))\n      print('='*100)\n      \n<em># replace laliga with any league you want<\/em>\nprint_records_antirecords(laliga)<\/pre>\n\n\n\n<pre class=\"wp-block-preformatted\">Presenting some records and antirecords:\nThe lowest value of SCORED had C\u00f3rdoba in 2014 and it is equal to 22.00\nThe highest value of SCORED had Real Madrid in 2014 and it is equal to 118.00\n================================================================\nThe lowest value of XG had Eibar in 2014 and it is equal to 29.56\nThe highest value of XG had Barcelona in 2015 and it is equal to 113.60\n================================================================\nThe lowest value of XG_DIFF had Barcelona in 2016 and it is equal to -22.45\nThe highest value of XG_DIFF had Las Palmas in 2017 and it is equal to 13.88\n================================================================\nThe lowest value of MISSED had Atl\u00e9tico de Madrid in 2015 and it is equal to 18.00\nThe highest value of MISSED had Osasuna in 2016 and it is equal to 94.00\n================================================================\nThe lowest value of XGA had Atl\u00e9tico de Madrid in 2015 and it is equal to 27.80\nThe highest value of XGA had Levante in 2018 and it is equal to 78.86\n================================================================\nThe lowest value of XGA_DIFF had Osasuna in 2016 and it is equal to -29.18\nThe highest value of XGA_DIFF had Valencia in 2015 and it is equal to 13.69\n================================================================\nThe lowest value of PTS had C\u00f3rdoba in 2014 and it is equal to 20.00\nThe highest value of PTS had Barcelona in 2014 and it is equal to 94.00\n================================================================\nThe lowest value of XPTS had Granada in 2016 and it is equal to 26.50\nThe highest value of XPTS had Barcelona in 2015 and it is equal to 94.38\n================================================================\nThe lowest value of XPTS_DIFF had Atl\u00e9tico de Madrid in 2017 and it is equal to -17.40\nThe highest value of XPTS_DIFF had Deportivo La Coru\u00f1a in 2017 and it is equal to 20.16<\/pre>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># one line per season: xG difference by final league position<\/em>\ndata = [\n    go.Scatter(\n        x = laliga['position'][laliga['year'] == year], \n        y = laliga['xG_diff'][laliga['year'] == year],\n        name = str(year),\n        mode = 'lines+markers'\n    )\n    for year in [2014, 2015, 2016, 2017, 2018]\n]\n\nlayout = go.Layout(\n    title=\"Comparing xG gap between positions\",\n    xaxis={'title': 'Position'},\n    yaxis={'title': \"xG difference\",\n    }\n)\n\nfig = go.Figure(data=data, layout=layout)\npy.iplot(fig)<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"700\" height=\"450\" 
src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_700\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/xg_gap_1.png\" alt=\"\" class=\"wp-image-940\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/xg_gap_1.png 700w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/xg_gap_1-300x193.png 300w\" sizes=\"(max-width: 700px) 100vw, 700px\" \/><\/figure>\n\n\n\n<pre class=\"wp-block-preformatted\">trace0 = go.Scatter(\n    x = laliga['position'][laliga['year'] == 2014], \n    y = laliga['xGA_diff'][laliga['year'] == 2014],\n    name = '2014',\n    mode = 'lines+markers'\n)\n\ntrace1 = go.Scatter(\n    x = laliga['position'][laliga['year'] == 2015], \n    y = laliga['xGA_diff'][laliga['year'] == 2015],\n    name='2015',\n    mode = 'lines+markers'\n)\n\ntrace2 = go.Scatter(\n    x = laliga['position'][laliga['year'] == 2016], \n    y = laliga['xGA_diff'][laliga['year'] == 2016],\n    name='2016',\n    mode = 'lines+markers'\n)\n\ntrace3 = go.Scatter(\n    x = laliga['position'][laliga['year'] == 2017], \n    y = laliga['xGA_diff'][laliga['year'] == 2017],\n    name='2017',\n    mode = 'lines+markers'\n)\n\ntrace4 = go.Scatter(\n    x = laliga['position'][laliga['year'] == 2018], \n    y = laliga['xGA_diff'][laliga['year'] == 2018],\n    name='2018',\n    mode = 'lines+markers'\n)\n\ndata = [trace0, trace1, trace2, trace3, trace4]\n\nlayout = go.Layout(\n    title=\"Comparing xGA gap between positions\",\n    xaxis={'title': 'Position'},\n    yaxis={'title': \"xGA difference\",\n    }\n)\n\nfig = go.Figure(data=data, layout=layout)\npy.iplot(fig)<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"700\" height=\"450\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_700\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/xga_gap.png\" alt=\"\" class=\"wp-image-941\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/xga_gap.png 700w, 
https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/xga_gap-300x193.png 300w\" sizes=\"(max-width: 700px) 100vw, 700px\" \/><\/figure>\n\n\n\n<pre class=\"wp-block-preformatted\">trace0 = go.Scatter(\n    x = laliga['position'][laliga['year'] == 2014], \n    y = laliga['xpts_diff'][laliga['year'] == 2014],\n    name = '2014',\n    mode = 'lines+markers'\n)\n\ntrace1 = go.Scatter(\n    x = laliga['position'][laliga['year'] == 2015], \n    y = laliga['xpts_diff'][laliga['year'] == 2015],\n    name='2015',\n    mode = 'lines+markers'\n)\n\ntrace2 = go.Scatter(\n    x = laliga['position'][laliga['year'] == 2016], \n    y = laliga['xpts_diff'][laliga['year'] == 2016],\n    name='2016',\n    mode = 'lines+markers'\n)\n\ntrace3 = go.Scatter(\n    x = laliga['position'][laliga['year'] == 2017], \n    y = laliga['xpts_diff'][laliga['year'] == 2017],\n    name='2017',\n    mode = 'lines+markers'\n)\n\ntrace4 = go.Scatter(\n    x = laliga['position'][laliga['year'] == 2018], \n    y = laliga['xpts_diff'][laliga['year'] == 2018],\n    name='2018',\n    mode = 'lines+markers'\n)\n\ndata = [trace0, trace1, trace2, trace3, trace4]\n\nlayout = go.Layout(\n    title=\"Comparing xPTS gap between positions\",\n    xaxis={'title': 'Position'},\n    yaxis={'title': \"xPTS difference\",\n    }\n)\n\nfig = go.Figure(data=data, layout=layout)\npy.iplot(fig)<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"700\" height=\"450\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_700\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/xpts_gap.png\" alt=\"\" class=\"wp-image-942\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/xpts_gap.png 700w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/xpts_gap-300x193.png 300w\" sizes=\"(max-width: 700px) 100vw, 700px\" \/><\/figure>\n\n\n\n<p>From the charts above we can clearly see that top teams score more, concede less and get more points than expected. That is why these teams are at the top. The situation is exactly the opposite for the outsiders, and mid-table teams perform about average. Totally logical, no great insights here.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># Check mean differences<\/em>\ndef get_diff_means(df):  \n  dm = df.groupby('year')[['xG_diff', 'xGA_diff', 'xpts_diff']].mean()\n  \n  return dm\n\nmeans = get_diff_means(laliga)\nmeans<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"236\" height=\"173\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/mean_diffs.jpg\" alt=\"\" class=\"wp-image-943\"\/><\/figure>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># Check median differences<\/em>\ndef get_diff_medians(df):  \n  dm = df.groupby('year')[['xG_diff', 'xGA_diff', 'xpts_diff']].median()\n  \n  return dm\n\nmedians = get_diff_medians(laliga)\nmedians<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"235\" height=\"172\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/median_diffs.jpg\" alt=\"\" class=\"wp-image-944\"\/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"Outliers-Detection\">Outliers Detection<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Z-Score<\/h3>\n\n\n\n<p>The z-score is the number of standard deviations a data point lies from the mean. We can use it to find outliers in our dataset by treating any point with |z-score| > 3 as an outlier.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># Getting outliers for xG using zscore<\/em>\nfrom scipy.stats import zscore\n<em># laliga[(np.abs(zscore(laliga[['xG_diff']])) > 2.0).all(axis=1)]<\/em>\ndf_xg[(np.abs(zscore(df_xg[['xG_diff']])) > 3.0).all(axis=1)]<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"764\" height=\"197\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_764\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/z_score_1.jpg\" alt=\"\" class=\"wp-image-945\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/z_score_1.jpg 764w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/z_score_1-300x77.jpg 300w\" sizes=\"(max-width: 764px) 100vw, 764px\" \/><\/figure>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># outliers for xGA<\/em>\n<em># laliga[(np.abs(zscore(laliga[['xGA_diff']])) > 2.0).all(axis=1)]<\/em>\ndf_xg[(np.abs(zscore(df_xg[['xGA_diff']])) > 3.0).all(axis=1)]<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"745\" height=\"64\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_745\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/z_score_2.jpg\" alt=\"\" class=\"wp-image-946\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/z_score_2.jpg 745w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/z_score_2-300x26.jpg 300w\" sizes=\"(max-width: 745px) 100vw, 745px\" \/><\/figure>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># Outliers for xPTS<\/em>\n<em># laliga[(np.abs(zscore(laliga[['xpts_diff']])) > 2.0).all(axis=1)]<\/em>\ndf_xg[(np.abs(zscore(df_xg[['xpts_diff']])) > 3.0).all(axis=1)]<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"788\" 
height=\"130\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_788\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/z_score_3.jpg\" alt=\"\" class=\"wp-image-947\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/z_score_3.jpg 788w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/z_score_3-300x49.jpg 300w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/z_score_3-768x127.jpg 768w\" sizes=\"(max-width: 788px) 100vw, 788px\" \/><\/figure>\n\n\n\n<p>12 outliers in total detected with the z-score. Poor Osasuna in 2016: almost 30 undeserved goals conceded.<\/p>\n\n\n\n<p>As we can see from this data, being an outlier does not by itself win you the season. But if you miss your chances, or concede goals you should not, and do it too often, you deserve relegation. Losing and being average is much easier than winning.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Interquartile Range (IQR)<\/h3>\n\n\n\n<p>The IQR is the difference between the third quartile and the first quartile of a dataset (Q3 - Q1). It is one way to describe the spread of a dataset.<\/p>\n\n\n\n<p>A commonly used rule says that a data point is an outlier if it lies more than 1.5 \u22c5 IQR above the third quartile or below the first quartile. In other words, low outliers are below Q1 - 1.5 \u22c5 IQR and high outliers are above Q3 + 1.5 \u22c5 IQR.<\/p>\n\n\n\n<p>Let's check.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># Trying a different method of outlier detection<\/em>\ndf_xg.describe()<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"693\" height=\"222\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_693\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/describe_2.jpg\" alt=\"\" class=\"wp-image-948\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/describe_2.jpg 693w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/describe_2-300x96.jpg 300w\" sizes=\"(max-width: 693px) 100vw, 693px\" \/><\/figure>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># using Interquartile Range Method to identify outliers<\/em>\n<em># note: each iqr_* variable stores 1.5 * IQR (the fence width), so the printed IQR values are already scaled by 1.5<\/em>\n<em># xG_diff<\/em>\niqr_xG = (df_xg.describe().loc['75%','xG_diff'] - df_xg.describe().loc['25%','xG_diff']) * 1.5\nupper_xG = df_xg.describe().loc['75%','xG_diff'] + iqr_xG\nlower_xG = df_xg.describe().loc['25%','xG_diff'] - iqr_xG\n\nprint('IQR for xG_diff: <strong>{:.2f}<\/strong>'.format(iqr_xG))\nprint('Upper border for xG_diff: <strong>{:.2f}<\/strong>'.format(upper_xG))\nprint('Lower border for xG_diff: <strong>{:.2f}<\/strong>'.format(lower_xG))\n\noutliers_xG = df_xg[(df_xg['xG_diff'] > upper_xG) | (df_xg['xG_diff'] < lower_xG)]\nprint('='*50)\n\n<em># xGA_diff<\/em>\niqr_xGA = (df_xg.describe().loc['75%','xGA_diff'] - df_xg.describe().loc['25%','xGA_diff']) * 1.5\nupper_xGA = df_xg.describe().loc['75%','xGA_diff'] + iqr_xGA\nlower_xGA = df_xg.describe().loc['25%','xGA_diff'] - iqr_xGA\n\nprint('IQR for xGA_diff: <strong>{:.2f}<\/strong>'.format(iqr_xGA))\nprint('Upper border for xGA_diff: <strong>{:.2f}<\/strong>'.format(upper_xGA))\nprint('Lower border for xGA_diff: 
<strong>{:.2f}<\/strong>'.format(lower_xGA))\n\noutliers_xGA = df_xg[(df_xg['xGA_diff'] > upper_xGA) | (df_xg['xGA_diff'] < lower_xGA)]\nprint('='*50)\n\n<em># xpts_diff<\/em>\niqr_xpts = (df_xg.describe().loc['75%','xpts_diff'] - df_xg.describe().loc['25%','xpts_diff']) * 1.5\nupper_xpts = df_xg.describe().loc['75%','xpts_diff'] + iqr_xpts\nlower_xpts = df_xg.describe().loc['25%','xpts_diff'] - iqr_xpts\n\nprint('IQR for xPTS_diff: <strong>{:.2f}<\/strong>'.format(iqr_xpts))\nprint('Upper border for xPTS_diff: <strong>{:.2f}<\/strong>'.format(upper_xpts))\nprint('Lower border for xPTS_diff: <strong>{:.2f}<\/strong>'.format(lower_xpts))\n\noutliers_xpts = df_xg[(df_xg['xpts_diff'] > upper_xpts) | (df_xg['xpts_diff'] < lower_xpts)]\nprint('='*50)\n\noutliers_full = pd.concat([outliers_xG, outliers_xGA, outliers_xpts])\noutliers_full = outliers_full.drop_duplicates()\n<\/pre>\n\n\n\n<pre class=\"wp-block-preformatted\">IQR for xG_diff: 13.16\nUpper border for xG_diff: 16.65\nLower border for xG_diff: -18.43\n==================================================\nIQR for xGA_diff: 13.95\nUpper border for xGA_diff: 17.15\nLower border for xGA_diff: -20.05\n==================================================\nIQR for xPTS_diff: 13.93\nUpper border for xPTS_diff: 18.73\nLower border for xPTS_diff: -18.41\n==================================================<\/pre>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># Adding bottom-up rankings to find the losers in each league (the number of teams differs between leagues, so I can't just do n-20)<\/em>\nmax_position = df_xg.groupby('league')['position'].max()\ndf_xg['position_reverse'] = np.nan\noutliers_full['position_reverse'] = np.nan\n\nfor i, row <strong>in<\/strong> df_xg.iterrows():\n  df_xg.at[i, 'position_reverse'] = np.abs(row['position'] - max_position[row['league']])+1\n  \nfor i, row <strong>in<\/strong> outliers_full.iterrows():\n  outliers_full.at[i, 'position_reverse'] = np.abs(row['position'] - 
max_position[row['league']])+1<\/pre>\n\n\n\n<pre class=\"wp-block-preformatted\">total_count = df_xg[(df_xg['position'] <= 4) | (df_xg['position_reverse'] <= 3)].count()[0]\noutlier_count = outliers_full[(outliers_full['position'] <= 4) | (outliers_full['position_reverse'] <= 3)].count()[0]\noutlier_prob = outlier_count \/ total_count\nprint('Probability of outlier in top or bottom of the final table: <strong>{:.2%}<\/strong>'.format(outlier_prob))<\/pre>\n\n\n\n<pre class=\"wp-block-preformatted\">Probability of outlier in top or bottom of the final table: 8.10%<\/pre>\n\n\n\n<p>So we can say that it is very likely that every year, in one of the 6 leagues, there will be a team that earns a ticket to the Champions League or Europa League with the help of luck on top of great skill, or a loser that drops to the second division because it cannot convert its chances.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># 1-3 outliers among all leagues in a year<\/em>\ndata = pd.DataFrame(outliers_full.groupby('league')['year'].count()).reset_index()\ndata = data.rename(index=int, columns={'year': 'outliers'})\nsns.barplot(x='league', y='outliers', data=data)\n<em># no outliers in Bundesliga<\/em><\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"398\" height=\"274\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_398\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/outliers.jpg\" alt=\"\" class=\"wp-image-949\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/outliers.jpg 398w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/outliers-300x207.jpg 300w\" sizes=\"(max-width: 398px) 100vw, 398px\" \/><\/figure>\n\n\n\n<p>Our winners and losers, with brilliant performance and brilliant underperformance.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">top_bottom = outliers_full[(outliers_full['position'] <= 4) | 
(outliers_full['position_reverse'] <= 3)].sort_values(by='league')\ntop_bottom<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"819\" height=\"499\" src=\"https:\/\/cdn.shortpixel.ai\/client\/q_glossy,ret_img,w_819\/http:\/\/35.180.88.53\/wp-content\/uploads\/2019\/09\/top_bottom.jpg\" alt=\"\" class=\"wp-image-950\" srcset=\"https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/top_bottom.jpg 819w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/top_bottom-300x183.jpg 300w, https:\/\/www.sergilehkyi.com\/wp-content\/uploads\/2019\/09\/top_bottom-768x468.jpg 768w\" sizes=\"(max-width: 819px) 100vw, 819px\" \/><\/figure>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># Let's get back to our list of teams that suddenly got into the top. Was that because of an unbelievable mix of luck and skill?<\/em>\not = [x for x <strong>in<\/strong> outlier_teams if x <strong>in<\/strong> top_bottom['team'].drop_duplicates().tolist()]\not\n<em># The answer is absolutely no. They just played well during 1 season. Sometimes that happens.<\/em><\/pre>\n\n\n\n<pre class=\"wp-block-preformatted\">[]<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusions<\/h2>\n\n\n\n<p>Football is a low-scoring game, and one goal can change the entire picture of a match and even the final results. That is why long-term analysis gives you a better picture of the situation.<\/p>\n\n\n\n<p>With the introduction of the xG metric (and the others derived from it) we can now really evaluate a team's performance over the long run and understand the difference between top teams, mid-table teams and absolute outsiders.<\/p>\n\n\n\n<p>xG brings new arguments to discussions about football, which makes it even more interesting. At the same time, the game does not lose its uncertainty factor and the possibility of crazy things happening. In fact, these crazy things now have a chance of being explained.<\/p>\n\n\n\n<p>In the end, we have found that there is almost a 100% chance that something strange will happen in one of the leagues. How epic it will be is only a matter of time.<\/p>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<p>The original work with interactive charts can be found\u00a0<a rel=\"noreferrer noopener\" href=\"https:\/\/www.kaggle.com\/slehkyi\/football-why-winners-win-and-losers-loose\" target=\"_blank\">here.<\/a><\/p>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<p style=\"text-align:center\">Photo by\u00a0<a href=\"https:\/\/unsplash.com\/@viennachanges?utm_source=unsplash&#038;utm_medium=referral&#038;utm_content=creditCopyText\">Vienna Reyes<\/a>\u00a0on\u00a0<a href=\"https:\/\/unsplash.com\/search\/photos\/soccer?utm_source=unsplash&#038;utm_medium=referral&#038;utm_content=creditCopyText\">Unsplash<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Exploring 5 years of European football Intro In this notebook we will explore modern metrics in football (xG, xGA and 
xPTS)&hellip;<\/p>\n","protected":false},"author":1,"featured_media":952,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[11],"tags":[],"translation":{"provider":"WPGlobus","version":"3.0.0","language":"es","enabled_languages":["gb","es","uk"],"languages":{"gb":{"title":true,"content":true,"excerpt":false},"es":{"title":true,"content":true,"excerpt":false},"uk":{"title":true,"content":true,"excerpt":false}}},"_links":{"self":[{"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/posts\/917"}],"collection":[{"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/comments?post=917"}],"version-history":[{"count":26,"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/posts\/917\/revisions"}],"predecessor-version":[{"id":998,"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/posts\/917\/revisions\/998"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/media\/952"}],"wp:attachment":[{"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/media?parent=917"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/categories?post=917"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.sergilehkyi.com\/es\/wp-json\/wp\/v2\/tags?post=917"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}