{"id":594,"date":"2025-01-25T16:08:00","date_gmt":"2025-01-25T15:08:00","guid":{"rendered":"https:\/\/noiseonthenet.space\/noise\/?p=594"},"modified":"2025-01-26T20:27:05","modified_gmt":"2025-01-26T19:27:05","slug":"meet-the-pandas","status":"publish","type":"post","link":"https:\/\/noiseonthenet.space\/noise\/2025\/01\/meet-the-pandas\/","title":{"rendered":"Meet the Pandas"},"content":{"rendered":"<p> <img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/noiseonthenet.space\/noise\/wp-content\/uploads\/2025\/01\/thomas-bonometti-OyO5NDiRPMM-unsplash-1.jpg?ssl=1\" alt=\"thomas-bonometti-OyO5NDiRPMM-unsplash.jpg\" \/> Photo by <a href=\"https:\/\/unsplash.com\/@bonopeppers?utm_content=creditCopyText&amp;utm_medium=referral&amp;utm_source=unsplash\">Thomas Bonometti<\/a> on <a href=\"https:\/\/unsplash.com\/photos\/sun-bear-lying-on-logs-OyO5NDiRPMM?utm_content=creditCopyText&amp;utm_medium=referral&amp;utm_source=unsplash\">Unsplash<\/a> <\/p>\n\n<p> We started our space trip to the galaxy of Python Analytics <a href=\"https:\/\/noiseonthenet.space\/noise\/2025\/01\/a-trip-to-jupyter-lab\/\">heading onto Jupyter<\/a> . 
<\/p>\n\n<p> Now it&rsquo;s time to meet some of the most fascinating inhabitants: <a href=\"https:\/\/pandas.pydata.org\/docs\/\">the pandas<\/a> <\/p>\n\n<p> The code, datasets and jupyter notebook for the posts in this series are available in this <a href=\"https:\/\/github.com\/noiseOnTheNet\/python-post023_jupyter_analitics\">repository<\/a> <\/p>\n\n<p> <a id=\"org934f50f\"><\/a> <\/p>\n<div id=\"outline-container-using-pandas-basic-introduction\" class=\"outline-2\">\n<h2 id=\"using-pandas-basic-introduction\">Using Pandas (basic introduction)<\/h2>\n<div class=\"outline-text-2\" id=\"text-using-pandas-basic-introduction\">\n<p> Pandas is a library to work with data using relational tables <\/p>\n\n<p> To prepare for this lesson execute the following cell <\/p>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-bash\" id=\"nil\"><span style=\"color: #89dceb;\">!<\/span>git clone https:\/\/github.com\/datasciencedojo\/datasets.git\n<\/pre>\n<\/div>\n\n<em><\/em>\n<pre class=\"example\" id=\"nil\">\nCloning into 'datasets'...\n<\/pre>\n\n<p> <a id=\"org5bc373c\"><\/a> import the pandas library and assign it a shorter alias <\/p>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cba6f7;\">import<\/span> pandas <span style=\"color: #cba6f7;\">as<\/span> pd\n<\/pre>\n<\/div>\n\n<p> <a id=\"orgb35a481\"><\/a> <\/p>\n<\/div>\n<div id=\"outline-container-loading-data\" class=\"outline-3\">\n<h3 id=\"loading-data\">Loading data<\/h3>\n<div class=\"outline-text-3\" id=\"text-loading-data\">\n<p> Pandas includes a rich set of input functions that allow you to get data from various file types <\/p>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-left\" 
\/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-left\">function<\/th>\n<th scope=\"col\" class=\"org-left\">format<\/th>\n<th scope=\"col\" class=\"org-left\">notes<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-left\"><code>pd.read_csv<\/code><\/td>\n<td class=\"org-left\">plain-text CSV<\/td>\n<td class=\"org-left\">&#xa0;<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\"><code>pd.read_excel<\/code><\/td>\n<td class=\"org-left\">binary Excel format<\/td>\n<td class=\"org-left\">requires an external library<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\"><code>pd.read_parquet<\/code><\/td>\n<td class=\"org-left\">fast binary columnar format<\/td>\n<td class=\"org-left\">requires pyarrow<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<p> A data frame offers many methods to explore its content, e.g. the <code>.head()<\/code> method shows the first rows of a data frame <\/p>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cdd6f4;\">titanic<\/span> <span style=\"color: #89dceb;\">=<\/span> pd.read_csv(<span style=\"color: #a6e3a1;\">\"datasets\/titanic.csv\"<\/span>)\ntitanic.head()\n<\/pre>\n<\/div>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-left\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-right\">&#xa0;<\/th>\n<th scope=\"col\" class=\"org-right\">PassengerId<\/th>\n<th scope=\"col\" class=\"org-right\">Survived<\/th>\n<th scope=\"col\" 
class=\"org-right\">Pclass<\/th>\n<th scope=\"col\" class=\"org-left\">Name<\/th>\n<th scope=\"col\" class=\"org-left\">Sex<\/th>\n<th scope=\"col\" class=\"org-right\">Age<\/th>\n<th scope=\"col\" class=\"org-right\">SibSp<\/th>\n<th scope=\"col\" class=\"org-right\">Parch<\/th>\n<th scope=\"col\" class=\"org-left\">Ticket<\/th>\n<th scope=\"col\" class=\"org-right\">Fare<\/th>\n<th scope=\"col\" class=\"org-right\">Cabin<\/th>\n<th scope=\"col\" class=\"org-left\">Embarked<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">3<\/td>\n<td class=\"org-left\">Braund, Mr. Owen Harris<\/td>\n<td class=\"org-left\">male<\/td>\n<td class=\"org-right\">22.0<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-left\">A\/5 21171<\/td>\n<td class=\"org-right\">7.25<\/td>\n<td class=\"org-right\">nan<\/td>\n<td class=\"org-left\">S<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">2<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-left\">Cumings, Mrs. John Bradley (Florence Briggs Thayer)<\/td>\n<td class=\"org-left\">female<\/td>\n<td class=\"org-right\">38.0<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-left\">PC 17599<\/td>\n<td class=\"org-right\">71.2833<\/td>\n<td class=\"org-right\">C85<\/td>\n<td class=\"org-left\">C<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">2<\/td>\n<td class=\"org-right\">3<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">3<\/td>\n<td class=\"org-left\">Heikkinen, Miss. Laina<\/td>\n<td class=\"org-left\">female<\/td>\n<td class=\"org-right\">26.0<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-left\">STON\/O2. 
3101282<\/td>\n<td class=\"org-right\">7.925<\/td>\n<td class=\"org-right\">nan<\/td>\n<td class=\"org-left\">S<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">3<\/td>\n<td class=\"org-right\">4<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-left\">Futrelle, Mrs. Jacques Heath (Lily May Peel)<\/td>\n<td class=\"org-left\">female<\/td>\n<td class=\"org-right\">35.0<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-left\">113803<\/td>\n<td class=\"org-right\">53.1<\/td>\n<td class=\"org-right\">C123<\/td>\n<td class=\"org-left\">S<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">4<\/td>\n<td class=\"org-right\">5<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">3<\/td>\n<td class=\"org-left\">Allen, Mr. William Henry<\/td>\n<td class=\"org-left\">male<\/td>\n<td class=\"org-right\">35.0<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-left\">373450<\/td>\n<td class=\"org-right\">8.05<\/td>\n<td class=\"org-right\">nan<\/td>\n<td class=\"org-left\">S<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<p> <a id=\"orgd989390\"><\/a> <\/p>\n<\/div>\n<\/div>\n<div id=\"outline-container-projection-selection-and-extension\" class=\"outline-3\">\n<h3 id=\"projection-selection-and-extension\">Projection, Selection and Extension<\/h3>\n<div class=\"outline-text-3\" id=\"text-projection-selection-and-extension\">\n<p> A data frame is a table; you can get its column names using the <code>.columns<\/code> attribute <\/p>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\">titanic.columns\n<\/pre>\n<\/div>\n\n<p> Index([&rsquo;PassengerId&rsquo;, &rsquo;Survived&rsquo;, &rsquo;Pclass&rsquo;, &rsquo;Name&rsquo;, &rsquo;Sex&rsquo;, &rsquo;Age&rsquo;, &rsquo;SibSp&rsquo;,        &rsquo;Parch&rsquo;, &rsquo;Ticket&rsquo;, &rsquo;Fare&rsquo;, &rsquo;Cabin&rsquo;, &rsquo;Embarked&rsquo;],  
     dtype=&rsquo;object&rsquo;) <\/p>\n\n\n<p> <a id=\"orgd7cac84\"><\/a> Columns can be accessed individually or in groups; this operation is called <b>projection<\/b> <\/p>\n\n<p> Single columns can be accessed either: <\/p>\n\n<ol class=\"org-ol\">\n<li>using the square bracket operator <code>df[\"age\"]<\/code><\/li>\n<li>using the dot operator, if the column name is a valid Python <b>identifier<\/b>, <code>df.age<\/code><\/li>\n<\/ol>\n\n<p> Each column is called a <b>Series<\/b> in pandas jargon <\/p>\n\n<p> Groups of columns can be accessed by passing a list of strings to the bracket operator <\/p>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\">titanic[[<span style=\"color: #a6e3a1;\">\"Survived\"<\/span>,<span style=\"color: #a6e3a1;\">\"Pclass\"<\/span>,<span style=\"color: #a6e3a1;\">\"Sex\"<\/span>,<span style=\"color: #a6e3a1;\">\"Age\"<\/span>]].head()\n<\/pre>\n<\/div>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-right\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-right\">&#xa0;<\/th>\n<th scope=\"col\" class=\"org-right\">Survived<\/th>\n<th scope=\"col\" class=\"org-right\">Pclass<\/th>\n<th scope=\"col\" class=\"org-left\">Sex<\/th>\n<th scope=\"col\" class=\"org-right\">Age<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">3<\/td>\n<td class=\"org-left\">male<\/td>\n<td class=\"org-right\">22.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-left\">female<\/td>\n<td class=\"org-right\">38.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">2<\/td>\n<td 
class=\"org-right\">1<\/td>\n<td class=\"org-right\">3<\/td>\n<td class=\"org-left\">female<\/td>\n<td class=\"org-right\">26.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">3<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-left\">female<\/td>\n<td class=\"org-right\">35.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">4<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">3<\/td>\n<td class=\"org-left\">male<\/td>\n<td class=\"org-right\">35.0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<p> <a id=\"org29f31f5\"><\/a> Operations on series are vectorized, i.e. they are applied element by element and return a new series <\/p>\n\n<p> Operations between a series and a scalar value are repeated for every value of the series <\/p>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\">titanic.Pclass <span style=\"color: #89dceb;\">==<\/span> <span style=\"color: #fab387;\">1<\/span>\n<\/pre>\n<\/div>\n\n<p> returns a series of booleans <\/p>\n\n<p> Passing a series of booleans to the square bracket operator keeps only the rows which satisfy the logical condition; this operation is called <b>selection<\/b>, which is a synonym for filtering <\/p>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\">titanic[titanic.Pclass <span style=\"color: #89dceb;\">==<\/span> <span style=\"color: #fab387;\">1<\/span>].head()\n<\/pre>\n<\/div>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  
class=\"org-right\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-left\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-right\">&#xa0;<\/th>\n<th scope=\"col\" class=\"org-right\">PassengerId<\/th>\n<th scope=\"col\" class=\"org-right\">Survived<\/th>\n<th scope=\"col\" class=\"org-right\">Pclass<\/th>\n<th scope=\"col\" class=\"org-left\">Name<\/th>\n<th scope=\"col\" class=\"org-left\">Sex<\/th>\n<th scope=\"col\" class=\"org-right\">Age<\/th>\n<th scope=\"col\" class=\"org-right\">SibSp<\/th>\n<th scope=\"col\" class=\"org-right\">Parch<\/th>\n<th scope=\"col\" class=\"org-right\">Ticket<\/th>\n<th scope=\"col\" class=\"org-right\">Fare<\/th>\n<th scope=\"col\" class=\"org-left\">Cabin<\/th>\n<th scope=\"col\" class=\"org-left\">Embarked<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">2<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-left\">Cumings, Mrs. John Bradley (Florence Briggs Thayer)<\/td>\n<td class=\"org-left\">female<\/td>\n<td class=\"org-right\">38.0<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">PC 17599<\/td>\n<td class=\"org-right\">71.2833<\/td>\n<td class=\"org-left\">C85<\/td>\n<td class=\"org-left\">C<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">3<\/td>\n<td class=\"org-right\">4<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-left\">Futrelle, Mrs. 
Jacques Heath (Lily May Peel)<\/td>\n<td class=\"org-left\">female<\/td>\n<td class=\"org-right\">35.0<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">113803<\/td>\n<td class=\"org-right\">53.1<\/td>\n<td class=\"org-left\">C123<\/td>\n<td class=\"org-left\">S<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">6<\/td>\n<td class=\"org-right\">7<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-left\">McCarthy, Mr. Timothy J<\/td>\n<td class=\"org-left\">male<\/td>\n<td class=\"org-right\">54.0<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">17463<\/td>\n<td class=\"org-right\">51.8625<\/td>\n<td class=\"org-left\">E46<\/td>\n<td class=\"org-left\">S<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">11<\/td>\n<td class=\"org-right\">12<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-left\">Bonnell, Miss. Elizabeth<\/td>\n<td class=\"org-left\">female<\/td>\n<td class=\"org-right\">58.0<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">113783<\/td>\n<td class=\"org-right\">26.55<\/td>\n<td class=\"org-left\">C103<\/td>\n<td class=\"org-left\">S<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">23<\/td>\n<td class=\"org-right\">24<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-left\">Sloper, Mr. 
William Thompson<\/td>\n<td class=\"org-left\">male<\/td>\n<td class=\"org-right\">28.0<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">113788<\/td>\n<td class=\"org-right\">35.5<\/td>\n<td class=\"org-left\">A6<\/td>\n<td class=\"org-left\">S<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<p> <a id=\"orgc6c2a41\"><\/a> Usually selection and projection are used at the same time; the <code>.loc[,]<\/code> indexer can be conveniently used for this purpose; its arguments are: <\/p>\n\n<ol class=\"org-ol\">\n<li>a boolean series for the rows, or the slice operator <code>:<\/code> for no filter<\/li>\n<li>a list of column names, or the slice operator <code>:<\/code> for all columns<\/li>\n<\/ol>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\">titanic.loc[titanic.Pclass<span style=\"color: #89dceb;\">==<\/span><span style=\"color: #fab387;\">1<\/span>,[<span style=\"color: #a6e3a1;\">\"Survived\"<\/span>,<span style=\"color: #a6e3a1;\">\"Sex\"<\/span>,<span style=\"color: #a6e3a1;\">\"Age\"<\/span>]].head()\n<\/pre>\n<\/div>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-right\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-right\">&#xa0;<\/th>\n<th scope=\"col\" class=\"org-right\">Survived<\/th>\n<th scope=\"col\" class=\"org-left\">Sex<\/th>\n<th scope=\"col\" class=\"org-right\">Age<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-left\">female<\/td>\n<td class=\"org-right\">38.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">3<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-left\">female<\/td>\n<td class=\"org-right\">35.0<\/td>\n<\/tr>\n\n<tr>\n<td 
class=\"org-right\">6<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-left\">male<\/td>\n<td class=\"org-right\">54.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">11<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-left\">female<\/td>\n<td class=\"org-right\">58.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">23<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-left\">male<\/td>\n<td class=\"org-right\">28.0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<p> <a id=\"org96ed70e\"><\/a> It is possible to extend a table with more columns, possibly as the result of a computation on other columns <\/p>\n\n<p> To create a new column, just assign an expression to a new column name, e.g. <\/p>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cdd6f4;\">df<\/span>[<span style=\"color: #a6e3a1;\">\"above_average\"<\/span>] <span style=\"color: #89dceb;\">=<\/span> (df.score <span style=\"color: #89dceb;\">&gt;<\/span> df.score.mean())\n<\/pre>\n<\/div>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cdd6f4;\">countries<\/span> <span style=\"color: #89dceb;\">=<\/span> pd.read_csv(<span style=\"color: #a6e3a1;\">\"datasets\/WorldDBTables\/CountryTable.csv\"<\/span>)\ncountries.columns\n<\/pre>\n<\/div>\n\n<p> Index([&rsquo;code&rsquo;, &rsquo;name&rsquo;, &rsquo;continent&rsquo;, &rsquo;region&rsquo;, &rsquo;surface_area&rsquo;,        &rsquo;independence_year&rsquo;, &rsquo;population&rsquo;, &rsquo;life_expectancy&rsquo;, &rsquo;gnp&rsquo;, &rsquo;gnp_old&rsquo;,        &rsquo;local_name&rsquo;, &rsquo;government_form&rsquo;, &rsquo;head_of_state&rsquo;, &rsquo;capital&rsquo;, &rsquo;code2&rsquo;],       dtype=&rsquo;object&rsquo;) <\/p>\n\n<p> <a id=\"org5a8401a\"><\/a> <\/p>\n<\/div>\n<div id=\"outline-container-exercise\" class=\"outline-4\">\n<h4 
id=\"exercise\">Exercise<\/h4>\n<div class=\"outline-text-4\" id=\"text-exercise\">\n<p> Calculate the population density of each country <\/p>\n\n<p> The countries table contains the population size in the <code>population<\/code> column and the land area in the <code>surface_area<\/code> column <\/p>\n\n<ol class=\"org-ol\">\n<li>calculate the ratio of these two columns and store it in a new column called <code>population_density<\/code><\/li>\n<\/ol>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cdd6f4;\">countries<\/span>[<span style=\"color: #a6e3a1;\">\"population_density\"<\/span>] <span style=\"color: #89dceb;\">=<\/span> countries.population <span style=\"color: #89dceb;\">\/<\/span> countries.surface_area\ncountries.loc[:,[<span style=\"color: #a6e3a1;\">\"name\"<\/span>,<span style=\"color: #a6e3a1;\">\"population_density\"<\/span>]].head()\n<\/pre>\n<\/div>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-right\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-right\">&#xa0;<\/th>\n<th scope=\"col\" class=\"org-left\">name<\/th>\n<th scope=\"col\" class=\"org-right\">population_density<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-left\">Aruba<\/td>\n<td class=\"org-right\">533.6787564766839<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-left\">Afghanistan<\/td>\n<td class=\"org-right\">34.84181631369903<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">2<\/td>\n<td class=\"org-left\">Angola<\/td>\n<td class=\"org-right\">10.32967032967033<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">3<\/td>\n<td class=\"org-left\">Anguilla<\/td>\n<td class=\"org-right\">83.33333333333333<\/td>\n<\/tr>\n\n<tr>\n<td 
class=\"org-right\">4<\/td>\n<td class=\"org-left\">Albania<\/td>\n<td class=\"org-right\">118.31083901488799<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<p> <a id=\"org963c687\"><\/a> <\/p>\n\n<ol class=\"org-ol\">\n<li>sort the table in descending order using the <code>.sort_values<\/code> function<\/li>\n<li>restrict the columns to only the <code>[\"name\",\"population_density\"]<\/code> columns<\/li>\n<li>show the first lines of the table using the <code>.head()<\/code> method: what are the most densely populated countries?<\/li>\n<\/ol>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\">countries.sort_values(<span style=\"color: #a6e3a1;\">\"population_density\"<\/span>,ascending<span style=\"color: #89dceb;\">=<\/span><span style=\"color: #fab387;\">False<\/span>).loc[:,[<span style=\"color: #a6e3a1;\">\"name\"<\/span>,<span style=\"color: #a6e3a1;\">\"population_density\"<\/span>]].head()\n<\/pre>\n<\/div>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-right\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-right\">&#xa0;<\/th>\n<th scope=\"col\" class=\"org-left\">name<\/th>\n<th scope=\"col\" class=\"org-right\">population_density<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-right\">129<\/td>\n<td class=\"org-left\">Macao<\/td>\n<td class=\"org-right\">26277.777777777777<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">131<\/td>\n<td class=\"org-left\">Monaco<\/td>\n<td class=\"org-right\">22666.666666666668<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">92<\/td>\n<td class=\"org-left\">Hong Kong<\/td>\n<td class=\"org-right\">6308.837209302325<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">186<\/td>\n<td class=\"org-left\">Singapore<\/td>\n<td class=\"org-right\">5771.844660194175<\/td>\n<\/tr>\n\n<tr>\n<td 
class=\"org-right\">79<\/td>\n<td class=\"org-left\">Gibraltar<\/td>\n<td class=\"org-right\">4166.666666666667<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<p> <a id=\"orgfd40cff\"><\/a> <\/p>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"outline-container-join-and-concatenation\" class=\"outline-3\">\n<h3 id=\"join-and-concatenation\">Join and concatenation<\/h3>\n<div class=\"outline-text-3\" id=\"text-join-and-concatenation\">\n<p> <a id=\"orgeb2c640\"><\/a> Relational data may be spread over more than one table; this can improve consistency and operational efficiency. <\/p>\n\n<p> If two tables represent related entities, they can be <b>joined<\/b> by selecting one or more columns containing the attributes which create the relationship. <\/p>\n\n<p> Each matched row in a table is replicated as many times as the number of matching rows in the other table <\/p>\n\n<p> Four kinds of join are commonly used <\/p>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-left\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-left\">join<\/th>\n<th scope=\"col\" class=\"org-left\">data included<\/th>\n<th scope=\"col\" class=\"org-left\">added missing values<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-left\">inner<\/td>\n<td class=\"org-left\">only rows which match in both tables<\/td>\n<td class=\"org-left\">none<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">left<\/td>\n<td class=\"org-left\">all rows of the first table<\/td>\n<td class=\"org-left\">for all non-matching rows of the first table<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">right<\/td>\n<td class=\"org-left\">all rows of the second table<\/td>\n<td class=\"org-left\">for all non-matching rows of the second table<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">outer<\/td>\n<td class=\"org-left\">all rows of both tables<\/td>\n<td 
class=\"org-left\">for all non-matching rows<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<p> Suppose we have a list of courses, a list of classrooms and the classroom bookings for each course; if we want to know where each professor should hold their lessons, we need to join these tables <\/p>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-left\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-right\">course_id<\/th>\n<th scope=\"col\" class=\"org-left\">title<\/th>\n<th scope=\"col\" class=\"org-left\">professor<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-left\">quantum field theory<\/td>\n<td class=\"org-left\">Bohr<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">2<\/td>\n<td class=\"org-left\">thermodynamics<\/td>\n<td class=\"org-left\">Carnot<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">3<\/td>\n<td class=\"org-left\">statistics<\/td>\n<td class=\"org-left\">Gosset<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-right\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-left\">classroom_id<\/th>\n<th scope=\"col\" class=\"org-left\">building<\/th>\n<th scope=\"col\" class=\"org-right\">floor<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-left\">p124<\/td>\n<td class=\"org-left\">Purple<\/td>\n<td class=\"org-right\">1<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">r201<\/td>\n<td class=\"org-left\">Red<\/td>\n<td class=\"org-right\">2<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col 
 class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-right\">course_id<\/th>\n<th scope=\"col\" class=\"org-left\">classroom_id<\/th>\n<th scope=\"col\" class=\"org-left\">weekday<\/th>\n<th scope=\"col\" class=\"org-right\">start<\/th>\n<th scope=\"col\" class=\"org-right\">end<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-left\">p124<\/td>\n<td class=\"org-left\">Monday<\/td>\n<td class=\"org-right\">9<\/td>\n<td class=\"org-right\">11<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-left\">r201<\/td>\n<td class=\"org-left\">Wednesday<\/td>\n<td class=\"org-right\">14<\/td>\n<td class=\"org-right\">15<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">2<\/td>\n<td class=\"org-left\">r201<\/td>\n<td class=\"org-left\">Tuesday<\/td>\n<td class=\"org-right\">14<\/td>\n<td class=\"org-right\">17<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">3<\/td>\n<td class=\"org-left\">r201<\/td>\n<td class=\"org-left\">Monday<\/td>\n<td class=\"org-right\">14<\/td>\n<td class=\"org-right\">15<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">3<\/td>\n<td class=\"org-left\">p124<\/td>\n<td class=\"org-left\">Tuesday<\/td>\n<td class=\"org-right\">9<\/td>\n<td class=\"org-right\">10<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">3<\/td>\n<td class=\"org-left\">p124<\/td>\n<td class=\"org-left\">Wednesday<\/td>\n<td class=\"org-right\">9<\/td>\n<td class=\"org-right\">10<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<p> the <code>pd.merge()<\/code> function performs the join operation e.g. 
<\/p>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cdd6f4;\">courses_bookings<\/span> <span style=\"color: #89dceb;\">=<\/span> pd.merge(courses, bookings)  # joined on the shared course_id column\n<span style=\"color: #cdd6f4;\">courses_classrooms<\/span> <span style=\"color: #89dceb;\">=<\/span> pd.merge(courses_bookings, classroom)  # joined on the shared classroom_id column\n<\/pre>\n<\/div>\n\n<p> The default kind of join is <code>inner<\/code>; you can use the optional <code>how=<\/code> argument to choose another kind. <\/p>\n\n<p> By default <code>pd.merge<\/code> joins on all columns with identical names: if you want to restrict the join to a given list of columns you can use the <code>on=<\/code> option. <\/p>\n\n<p> If the join columns have different names in the two tables you can use the <code>left_on=<\/code> and <code>right_on=<\/code> options to match them. <\/p>\n<\/div>\n<div id=\"outline-container-exercise\" class=\"outline-4\">\n<h4 id=\"exercise\">Exercise<\/h4>\n<div class=\"outline-text-4\" id=\"text-exercise\">\n<ul class=\"org-ul\">\n<li>in the country table we have a list of countries including their population<\/li>\n<li>in the languages table we have a list of languages spoken in each country and the percentage of the population which speaks each language<\/li>\n<li>in the country table we have a textual <code>code<\/code> which is uniquely assigned to each country<\/li>\n<li>in the languages table we have the same code in a column called <code>country_code<\/code><\/li>\n\n<li>load the language table from <code>datasets\/WorldDBTables\/LanguageTable.csv<\/code> using the <code>pd.read_csv<\/code> function and store it in a variable called <code>languages<\/code><\/li>\n<li>create a table named <code>languages_by_country<\/code> using the <code>pd.merge<\/code> function and joining the column <code>code<\/code> of table <code>countries<\/code> with the column <code>country_code<\/code> from the <code>languages<\/code> 
table<\/li>\n<li>calculate the number of people speaking a language by multiplying the <code>population<\/code> column by the <code>percentage<\/code> column (don&rsquo;t forget to divide by 100!); put the result in a column called <code>people_speaking<\/code><\/li>\n<li>show some rows of the table keeping only the following columns: <code>[\"name\",\"language\",\"people_speaking\",\"official\"]<\/code>; what do you see?<\/li>\n<\/ul>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cdd6f4;\">languages<\/span> <span style=\"color: #89dceb;\">=<\/span> pd.read_csv(<span style=\"color: #a6e3a1;\">\"datasets\/WorldDBTables\/LanguageTable.csv\"<\/span>)\n<\/pre>\n<\/div>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cdd6f4;\">languages_by_country<\/span> <span style=\"color: #89dceb;\">=<\/span> pd.merge(\n    countries, languages, \n    how<span style=\"color: #89dceb;\">=<\/span><span style=\"color: #a6e3a1;\">\"inner\"<\/span>, \n    left_on<span style=\"color: #89dceb;\">=<\/span>[<span style=\"color: #a6e3a1;\">\"code\"<\/span>], right_on<span style=\"color: #89dceb;\">=<\/span>[<span style=\"color: #a6e3a1;\">\"country_code\"<\/span>]\n)\n<span style=\"color: #cdd6f4;\">languages_by_country<\/span>[<span style=\"color: #a6e3a1;\">\"people_speaking\"<\/span>] <span style=\"color: #89dceb;\">=<\/span> languages_by_country.population <span style=\"color: #89dceb;\">*<\/span> \\\n    languages_by_country.percentage <span style=\"color: #89dceb;\">\/<\/span> <span style=\"color: #fab387;\">100<\/span>\nlanguages_by_country[[<span style=\"color: #a6e3a1;\">\"name\"<\/span>,<span style=\"color: #a6e3a1;\">\"language\"<\/span>,<span style=\"color: #a6e3a1;\">\"people_speaking\"<\/span>,<span style=\"color: 
#a6e3a1;\">\"official\"<\/span>]].head()\n<\/pre>\n<\/div>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-left\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-right\">&#xa0;<\/th>\n<th scope=\"col\" class=\"org-left\">name<\/th>\n<th scope=\"col\" class=\"org-left\">language<\/th>\n<th scope=\"col\" class=\"org-right\">people_speaking<\/th>\n<th scope=\"col\" class=\"org-left\">official<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-left\">Aruba<\/td>\n<td class=\"org-left\">Dutch<\/td>\n<td class=\"org-right\">5459.0<\/td>\n<td class=\"org-left\">T<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-left\">Aruba<\/td>\n<td class=\"org-left\">English<\/td>\n<td class=\"org-right\">9785.0<\/td>\n<td class=\"org-left\">F<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">2<\/td>\n<td class=\"org-left\">Aruba<\/td>\n<td class=\"org-left\">Papiamento<\/td>\n<td class=\"org-right\">79001.0<\/td>\n<td class=\"org-left\">F<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">3<\/td>\n<td class=\"org-left\">Aruba<\/td>\n<td class=\"org-left\">Spanish<\/td>\n<td class=\"org-right\">7622.0<\/td>\n<td class=\"org-left\">F<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">4<\/td>\n<td class=\"org-left\">Afghanistan<\/td>\n<td class=\"org-left\">Balochi<\/td>\n<td class=\"org-right\">204480.0<\/td>\n<td class=\"org-left\">F<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<p> <a id=\"org5426b97\"><\/a> <\/p>\n<\/div>\n<\/div>\n<div id=\"outline-container-concatenation\" class=\"outline-4\">\n<h4 id=\"concatenation\">Concatenation<\/h4>\n<div class=\"outline-text-4\" id=\"text-concatenation\">\n<p> It may happen that your data is collected in separate dataframes with the same columns and you need to create a 
single one from all of them. <\/p>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-bash\" id=\"nil\">unzip ROMA.zip TG_SOUID100860.txt\n<\/pre>\n<\/div>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-bash\" id=\"nil\">unzip BARI.zip TG_SOUID245914.txt\n<\/pre>\n<\/div>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cdd6f4;\">roma<\/span> <span style=\"color: #89dceb;\">=<\/span> pd.read_csv(<span style=\"color: #a6e3a1;\">\"TG_SOUID100860.txt\"<\/span>,skiprows<span style=\"color: #89dceb;\">=<\/span><span style=\"color: #fab387;\">20<\/span>)\nroma.head()\n<\/pre>\n<\/div>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-right\">&#xa0;<\/th>\n<th scope=\"col\" class=\"org-right\">SOUID<\/th>\n<th scope=\"col\" class=\"org-right\">DATE<\/th>\n<th scope=\"col\" class=\"org-right\">TG<\/th>\n<th scope=\"col\" class=\"org-right\">Q_TG<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">100860<\/td>\n<td class=\"org-right\">19510101<\/td>\n<td class=\"org-right\">76<\/td>\n<td class=\"org-right\">0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">100860<\/td>\n<td class=\"org-right\">19510102<\/td>\n<td class=\"org-right\">108<\/td>\n<td class=\"org-right\">0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">2<\/td>\n<td class=\"org-right\">100860<\/td>\n<td class=\"org-right\">19510103<\/td>\n<td class=\"org-right\">116<\/td>\n<td class=\"org-right\">0<\/td>\n<\/tr>\n\n<tr>\n<td 
class=\"org-right\">3<\/td>\n<td class=\"org-right\">100860<\/td>\n<td class=\"org-right\">19510104<\/td>\n<td class=\"org-right\">115<\/td>\n<td class=\"org-right\">0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">4<\/td>\n<td class=\"org-right\">100860<\/td>\n<td class=\"org-right\">19510105<\/td>\n<td class=\"org-right\">82<\/td>\n<td class=\"org-right\">0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cdd6f4;\">bari<\/span> <span style=\"color: #89dceb;\">=<\/span> pd.read_csv(<span style=\"color: #a6e3a1;\">\"TG_SOUID245914.txt\"<\/span>,skiprows<span style=\"color: #89dceb;\">=<\/span><span style=\"color: #fab387;\">20<\/span>)\nbari.head()\n<\/pre>\n<\/div>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-right\">&#xa0;<\/th>\n<th scope=\"col\" class=\"org-right\">SOUID<\/th>\n<th scope=\"col\" class=\"org-right\">DATE<\/th>\n<th scope=\"col\" class=\"org-right\">TG<\/th>\n<th scope=\"col\" class=\"org-right\">Q_TG<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">245914<\/td>\n<td class=\"org-right\">20211201<\/td>\n<td class=\"org-right\">-9999<\/td>\n<td class=\"org-right\">9<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">245914<\/td>\n<td class=\"org-right\">20211202<\/td>\n<td class=\"org-right\">-9999<\/td>\n<td class=\"org-right\">9<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">2<\/td>\n<td class=\"org-right\">245914<\/td>\n<td class=\"org-right\">20211203<\/td>\n<td class=\"org-right\">-9999<\/td>\n<td class=\"org-right\">9<\/td>\n<\/tr>\n\n<tr>\n<td 
class=\"org-right\">3<\/td>\n<td class=\"org-right\">245914<\/td>\n<td class=\"org-right\">20211204<\/td>\n<td class=\"org-right\">-9999<\/td>\n<td class=\"org-right\">9<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">4<\/td>\n<td class=\"org-right\">245914<\/td>\n<td class=\"org-right\">20211205<\/td>\n<td class=\"org-right\">-9999<\/td>\n<td class=\"org-right\">9<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<p> <a id=\"org367fdf8\"><\/a> the <code>pd.concat()<\/code> function can concatenate a list of data frames; the default behavior is consistent with the semantics of relations and it returns a single data frame: <\/p>\n\n<ul class=\"org-ul\">\n<li>columns will be the union of all columns of each individual data frame in the input<\/li>\n<li>rows will keep the same order as the input data frames<\/li>\n<\/ul>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cdd6f4;\">temperatures<\/span> <span style=\"color: #89dceb;\">=<\/span> pd.concat([roma,bari])\n\n<span style=\"color: #6c7086;\"># <\/span><span style=\"color: #6c7086;\">this will remove extra spaces from column names<\/span>\ntemperatures.<span style=\"color: #cdd6f4;\">columns<\/span> <span style=\"color: #89dceb;\">=<\/span> <span style=\"color: #f38ba8;\">list<\/span>(<span style=\"color: #f38ba8;\">map<\/span>(<span style=\"color: #f38ba8;\">str<\/span>.strip,temperatures.columns))\n\n<span style=\"color: #6c7086;\"># <\/span><span style=\"color: #6c7086;\">this will transform the column type<\/span>\n<span style=\"color: #cba6f7;\">for<\/span> col <span style=\"color: #cba6f7;\">in<\/span> [<span style=\"color: #a6e3a1;\">\"SOUID\"<\/span>,<span style=\"color: #a6e3a1;\">\"Q_TG\"<\/span>]:\n    <span style=\"color: #cdd6f4;\">temperatures<\/span>[col] <span style=\"color: #89dceb;\">=<\/span> temperatures[col].astype(<span style=\"color: #a6e3a1;\">\"category\"<\/span>)\n<span style=\"color: 
#cdd6f4;\">temperatures<\/span>[<span style=\"color: #a6e3a1;\">\"DATE\"<\/span>]<span style=\"color: #89dceb;\">=<\/span>pd.to_datetime(temperatures[<span style=\"color: #a6e3a1;\">\"DATE\"<\/span>],<span style=\"color: #f38ba8;\">format<\/span><span style=\"color: #89dceb;\">=<\/span><span style=\"color: #a6e3a1;\">\"%Y%m%d\"<\/span>)\n<span style=\"color: #f38ba8;\">print<\/span>(temperatures.Q_TG.unique())\ntemperatures.loc[temperatures.Q_TG <span style=\"color: #89dceb;\">!=<\/span> <span style=\"color: #fab387;\">9<\/span>,:].describe(include<span style=\"color: #89dceb;\">=<\/span><span style=\"color: #a6e3a1;\">\"all\"<\/span>)\n<\/pre>\n<\/div>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-left\">&#xa0;<\/th>\n<th scope=\"col\" class=\"org-right\">SOUID<\/th>\n<th scope=\"col\" class=\"org-left\">DATE<\/th>\n<th scope=\"col\" class=\"org-right\">TG<\/th>\n<th scope=\"col\" class=\"org-right\">Q_TG<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-left\">count<\/td>\n<td class=\"org-right\">21717.0<\/td>\n<td class=\"org-left\">21717<\/td>\n<td class=\"org-right\">21717.0<\/td>\n<td class=\"org-right\">21717.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">unique<\/td>\n<td class=\"org-right\">2.0<\/td>\n<td class=\"org-left\">nan<\/td>\n<td class=\"org-right\">nan<\/td>\n<td class=\"org-right\">2.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">top<\/td>\n<td class=\"org-right\">100860.0<\/td>\n<td class=\"org-left\">nan<\/td>\n<td class=\"org-right\">nan<\/td>\n<td class=\"org-right\">0.0<\/td>\n<\/tr>\n\n<tr>\n<td 
class=\"org-left\">freq<\/td>\n<td class=\"org-right\">21525.0<\/td>\n<td class=\"org-left\">nan<\/td>\n<td class=\"org-right\">nan<\/td>\n<td class=\"org-right\">21711.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">mean<\/td>\n<td class=\"org-right\">nan<\/td>\n<td class=\"org-left\">1980-11-11 03:17:47.716535360<\/td>\n<td class=\"org-right\">154.8837316388083<\/td>\n<td class=\"org-right\">nan<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">min<\/td>\n<td class=\"org-right\">nan<\/td>\n<td class=\"org-left\">1951-01-01 00:00:00<\/td>\n<td class=\"org-right\">-56.0<\/td>\n<td class=\"org-right\">nan<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">25%<\/td>\n<td class=\"org-right\">nan<\/td>\n<td class=\"org-left\">1965-11-17 00:00:00<\/td>\n<td class=\"org-right\">101.0<\/td>\n<td class=\"org-right\">nan<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">50%<\/td>\n<td class=\"org-right\">nan<\/td>\n<td class=\"org-left\">1980-09-28 00:00:00<\/td>\n<td class=\"org-right\">150.0<\/td>\n<td class=\"org-right\">nan<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">75%<\/td>\n<td class=\"org-right\">nan<\/td>\n<td class=\"org-left\">1995-08-22 00:00:00<\/td>\n<td class=\"org-right\">212.0<\/td>\n<td class=\"org-right\">nan<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">max<\/td>\n<td class=\"org-right\">nan<\/td>\n<td class=\"org-left\">2022-10-18 00:00:00<\/td>\n<td class=\"org-right\">327.0<\/td>\n<td class=\"org-right\">nan<\/td>\n<\/tr>\n\n<tr>\n<td 
class=\"org-left\">std<\/td>\n<td class=\"org-right\">nan<\/td>\n<td class=\"org-left\">nan<\/td>\n<td class=\"org-right\">66.53937042433274<\/td>\n<td class=\"org-right\">nan<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<em><\/em>\n<pre class=\"example\" id=\"nil\">\n[0, 9, 1]\nCategories (3, int64): [0, 1, 9]\n<\/pre>\n\n<p> <a id=\"orgf699dc1\"><\/a> <\/p>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"outline-container-aggregation\" class=\"outline-3\">\n<h3 id=\"aggregation\">Aggregation<\/h3>\n<div class=\"outline-text-3\" id=\"text-aggregation\">\n<p> Very often you may want to group your data according to one or more attributes and perform some calculation on each group; this operation is called <b>aggregation<\/b>. <\/p>\n\n<p> For example, suppose I want to split a restaurant bill with my friends and I have a dataframe named <code>bill<\/code> which looks like the following table: <\/p>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-right\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-left\">person<\/th>\n<th scope=\"col\" class=\"org-left\">item<\/th>\n<th scope=\"col\" class=\"org-right\">amount<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-left\">me<\/td>\n<td class=\"org-left\">pepperoni pizza<\/td>\n<td class=\"org-right\">12<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">me<\/td>\n<td class=\"org-left\">lager pils<\/td>\n<td class=\"org-right\">5<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">andrea<\/td>\n<td class=\"org-left\">cheeseburger<\/td>\n<td class=\"org-right\">10<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">andrea<\/td>\n<td class=\"org-left\">coca cola<\/td>\n<td class=\"org-right\">2<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">andrea<\/td>\n<td class=\"org-left\">french fries<\/td>\n<td class=\"org-right\">2<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cdd6f4;\">groups<\/span> <span style=\"color: #89dceb;\">=<\/span> 
bill.groupby([<span style=\"color: #a6e3a1;\">\"person\"<\/span>])\ngroups.agg({<span style=\"color: #a6e3a1;\">\"amount\"<\/span>:<span style=\"color: #a6e3a1;\">\"sum\"<\/span>})\n<\/pre>\n<\/div>\n\n<p> will return (group keys are sorted by default) <\/p>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-right\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-left\">person<\/th>\n<th scope=\"col\" class=\"org-right\">amount<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-left\">andrea<\/td>\n<td class=\"org-right\">14<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">me<\/td>\n<td class=\"org-right\">17<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<p> It is also possible to ask for multiple aggregations of the same column by using a list of functions <\/p>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cdd6f4;\">g<\/span> <span style=\"color: #89dceb;\">=<\/span> titanic.groupby([<span style=\"color: #a6e3a1;\">\"Pclass\"<\/span>,<span style=\"color: #a6e3a1;\">\"Sex\"<\/span>])\n<span style=\"color: #cdd6f4;\">age_summary<\/span> <span style=\"color: #89dceb;\">=<\/span> g.agg({<span style=\"color: #a6e3a1;\">\"Age\"<\/span>:[<span style=\"color: #a6e3a1;\">\"min\"<\/span>,<span style=\"color: #a6e3a1;\">\"max\"<\/span>,<span style=\"color: #a6e3a1;\">\"mean\"<\/span>]})\nage_summary\n<\/pre>\n<\/div>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-left\">&#xa0;<\/th>\n<th scope=\"col\" class=\"org-right\">(Age min)<\/th>\n<th scope=\"col\" class=\"org-right\">(Age max)<\/th>\n<th scope=\"col\" class=\"org-right\">(Age 
mean)<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-left\">(1 female)<\/td>\n<td class=\"org-right\">2.0<\/td>\n<td class=\"org-right\">63.0<\/td>\n<td class=\"org-right\">34.61176470588235<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(1 male)<\/td>\n<td class=\"org-right\">0.92<\/td>\n<td class=\"org-right\">80.0<\/td>\n<td class=\"org-right\">41.28138613861386<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(2 female)<\/td>\n<td class=\"org-right\">2.0<\/td>\n<td class=\"org-right\">57.0<\/td>\n<td class=\"org-right\">28.722972972972972<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(2 male)<\/td>\n<td class=\"org-right\">0.67<\/td>\n<td class=\"org-right\">70.0<\/td>\n<td class=\"org-right\">30.74070707070707<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(3 female)<\/td>\n<td class=\"org-right\">0.75<\/td>\n<td class=\"org-right\">63.0<\/td>\n<td class=\"org-right\">21.75<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(3 male)<\/td>\n<td class=\"org-right\">0.42<\/td>\n<td class=\"org-right\">74.0<\/td>\n<td class=\"org-right\">26.507588932806325<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<p> <a id=\"org48f5c16\"><\/a> Please note that here the generated columns are accessible using a tuple i.e. 
<\/p>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\">age_summary[(<span style=\"color: #a6e3a1;\">\"Age\"<\/span>,<span style=\"color: #a6e3a1;\">\"mean\"<\/span>)]\n<\/pre>\n<\/div>\n\n<p> <a id=\"orgd39c706\"><\/a> <\/p>\n<\/div>\n<\/div>\n<div id=\"outline-container-exercise\" class=\"outline-3\">\n<h3 id=\"exercise\">Exercise<\/h3>\n<div class=\"outline-text-3\" id=\"text-exercise\">\n<p> using the <code>languages_by_country<\/code> table we created in the previous exercise: <\/p>\n\n<ol class=\"org-ol\">\n<li>create a grouping by using the <code>\"language\"<\/code> column<\/li>\n<li>using the <code>.agg()<\/code> method calculate how many people speak each language<\/li>\n<li>sort the result in descending order, so that the largest group comes first<\/li>\n<li>show the first lines using the <code>.head()<\/code> method<\/li>\n<\/ol>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cdd6f4;\">g<\/span> <span style=\"color: #89dceb;\">=<\/span> languages_by_country.groupby([<span style=\"color: #a6e3a1;\">\"language\"<\/span>])\n<span style=\"color: #cdd6f4;\">languages_spoken<\/span> <span style=\"color: #89dceb;\">=<\/span> g.agg({<span style=\"color: #a6e3a1;\">\"people_speaking\"<\/span>:<span style=\"color: #a6e3a1;\">\"sum\"<\/span>})\n<span style=\"color: #cdd6f4;\">languages_spoken_sorted<\/span> <span style=\"color: #89dceb;\">=<\/span> languages_spoken.sort_values(<span style=\"color: #a6e3a1;\">\"people_speaking\"<\/span>,ascending<span style=\"color: #89dceb;\">=<\/span><span style=\"color: #fab387;\">False<\/span>)\nlanguages_spoken_sorted.head(<span style=\"color: #fab387;\">20<\/span>)\n<\/pre>\n<\/div>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-right\" 
\/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-left\">language<\/th>\n<th scope=\"col\" class=\"org-right\">people_speaking<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-left\">Chinese<\/td>\n<td class=\"org-right\">1190152805.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">Hindi<\/td>\n<td class=\"org-right\">405619174.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">Spanish<\/td>\n<td class=\"org-right\">307997398.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">Bengali<\/td>\n<td class=\"org-right\">209304719.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">Arabic<\/td>\n<td class=\"org-right\">205490840.7<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">Portuguese<\/td>\n<td class=\"org-right\">176981914.4<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">Japanese<\/td>\n<td class=\"org-right\">126254034.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">Punjabi<\/td>\n<td class=\"org-right\">104025371.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">English<\/td>\n<td class=\"org-right\">91616031.3<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">Javanese<\/td>\n<td class=\"org-right\">83570158.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">Telugu<\/td>\n<td class=\"org-right\">79065636.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">Marathi<\/td>\n<td class=\"org-right\">75010988.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">Korean<\/td>\n<td class=\"org-right\">71450757.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">Vietnamese<\/td>\n<td class=\"org-right\">69908416.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">Tamil<\/td>\n<td class=\"org-right\">68682272.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">French<\/td>\n<td class=\"org-right\">67947730.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">Urdu<\/td>\n<td class=\"org-right\">63589470.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">Italian<\/td>\n<td class=\"org-right\">57183654.1<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">Gujarati<\/td>\n<td 
class=\"org-right\">48655776.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">Malay<\/td>\n<td class=\"org-right\">41517994.0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cdd6f4;\">g<\/span> <span style=\"color: #89dceb;\">=<\/span> languages_by_country.groupby([<span style=\"color: #a6e3a1;\">\"continent\"<\/span>,<span style=\"color: #a6e3a1;\">\"language\"<\/span>])\n<span style=\"color: #cdd6f4;\">languages_spoken<\/span> <span style=\"color: #89dceb;\">=<\/span> g.agg({<span style=\"color: #a6e3a1;\">\"people_speaking\"<\/span>:<span style=\"color: #a6e3a1;\">\"sum\"<\/span>})\n<span style=\"color: #cdd6f4;\">languages_spoken_sorted<\/span> <span style=\"color: #89dceb;\">=<\/span> languages_spoken.sort_values(<span style=\"color: #a6e3a1;\">\"people_speaking\"<\/span>,ascending<span style=\"color: #89dceb;\">=<\/span><span style=\"color: #fab387;\">False<\/span>)\nlanguages_spoken_sorted.head(<span style=\"color: #fab387;\">20<\/span>)\n<\/pre>\n<\/div>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-right\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-left\">&#xa0;<\/th>\n<th scope=\"col\" class=\"org-right\">people_speaking<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-left\">(Asia Chinese)<\/td>\n<td class=\"org-right\">1189353427.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(Asia Hindi)<\/td>\n<td class=\"org-right\">405169038.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(Asia Bengali)<\/td>\n<td class=\"org-right\">209304719.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(South America Portuguese)<\/td>\n<td class=\"org-right\">166037997.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(South America Spanish)<\/td>\n<td 
class=\"org-right\">145620868.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(Africa Arabic)<\/td>\n<td class=\"org-right\">134392131.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(North America Spanish)<\/td>\n<td class=\"org-right\">132707046.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(Asia Japanese)<\/td>\n<td class=\"org-right\">125573574.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(Asia Punjabi)<\/td>\n<td class=\"org-right\">103807342.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(Asia Javanese)<\/td>\n<td class=\"org-right\">83570158.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(Asia Telugu)<\/td>\n<td class=\"org-right\">79065636.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(Asia Marathi)<\/td>\n<td class=\"org-right\">75010988.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(Asia Korean)<\/td>\n<td class=\"org-right\">71445687.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(Asia Vietnamese)<\/td>\n<td class=\"org-right\">69908416.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(Asia Arabic)<\/td>\n<td class=\"org-right\">69184280.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(Asia Tamil)<\/td>\n<td class=\"org-right\">68682272.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(Asia Urdu)<\/td>\n<td class=\"org-right\">63589470.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(Europe English)<\/td>\n<td class=\"org-right\">61799068.300000004<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(Europe French)<\/td>\n<td class=\"org-right\">60455448.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">(Europe Italian)<\/td>\n<td class=\"org-right\">55344151.1<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cdd6f4;\">languages_spoken_sorted<\/span><span style=\"color: #89dceb;\">=<\/span>languages_spoken_sorted.reset_index()\n<span style=\"color: #cdd6f4;\">g<\/span> <span style=\"color: #89dceb;\">=<\/span> 
languages_spoken_sorted.groupby([<span style=\"color: #a6e3a1;\">\"continent\"<\/span>])\n<span style=\"color: #cdd6f4;\">result<\/span> <span style=\"color: #89dceb;\">=<\/span> []\n<span style=\"color: #cba6f7;\">for<\/span> i,subtable <span style=\"color: #cba6f7;\">in<\/span> g:\n    result.append(subtable.head(<span style=\"color: #fab387;\">3<\/span>).reset_index())\npd.concat(result).head()\n<\/pre>\n<\/div>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-right\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-right\">&#xa0;<\/th>\n<th scope=\"col\" class=\"org-right\">index<\/th>\n<th scope=\"col\" class=\"org-left\">continent<\/th>\n<th scope=\"col\" class=\"org-left\">language<\/th>\n<th scope=\"col\" class=\"org-right\">people_speaking<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">5<\/td>\n<td class=\"org-left\">Africa<\/td>\n<td class=\"org-left\">Arabic<\/td>\n<td class=\"org-right\">134392131.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">32<\/td>\n<td class=\"org-left\">Africa<\/td>\n<td class=\"org-left\">Hausa<\/td>\n<td class=\"org-right\">29225396.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">2<\/td>\n<td class=\"org-right\">33<\/td>\n<td class=\"org-left\">Africa<\/td>\n<td class=\"org-left\">Joruba<\/td>\n<td class=\"org-right\">24868874.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-left\">Asia<\/td>\n<td class=\"org-left\">Chinese<\/td>\n<td class=\"org-right\">1189353427.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-left\">Asia<\/td>\n<td class=\"org-left\">Hindi<\/td>\n<td 
class=\"org-right\">405169038.0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<p> <a id=\"orgce7d60d\"><\/a> <\/p>\n<\/div>\n<\/div>\n<div id=\"outline-container-translate-the-content-of-a-table\" class=\"outline-3\">\n<h3 id=\"translate-the-content-of-a-table\">Translate the content of a table<\/h3>\n<div class=\"outline-text-3\" id=\"text-translate-the-content-of-a-table\">\n<p> Suppose we need to translate some foreign language content <\/p>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cdd6f4;\">resources<\/span> <span style=\"color: #89dceb;\">=<\/span> pd.read_csv(<span style=\"color: #a6e3a1;\">\"ds523_consumoacquaenergia.csv\"<\/span>,sep<span style=\"color: #89dceb;\">=<\/span><span style=\"color: #a6e3a1;\">\";\"<\/span>)\nresources.head()\n<\/pre>\n<\/div>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-right\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-right\">&#xa0;<\/th>\n<th scope=\"col\" class=\"org-right\">anno<\/th>\n<th scope=\"col\" class=\"org-left\">Consumo pro capite tipo<\/th>\n<th scope=\"col\" class=\"org-right\">Consumo pro capite<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">2011<\/td>\n<td class=\"org-left\">Energia elettrica per uso domestico<\/td>\n<td class=\"org-right\">1196.1<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">2011<\/td>\n<td class=\"org-left\">Gas metano per uso domestico e riscaldamento<\/td>\n<td class=\"org-right\">377.9<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">2<\/td>\n<td class=\"org-right\">2011<\/td>\n<td class=\"org-left\">Acqua fatturata per uso domestico<\/td>\n<td class=\"org-right\">83.1<\/td>\n<\/tr>\n\n<tr>\n<td 
class=\"org-right\">3<\/td>\n<td class=\"org-right\">2010<\/td>\n<td class=\"org-left\">Energia elettrica per uso domestico<\/td>\n<td class=\"org-right\">1200.7<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">4<\/td>\n<td class=\"org-right\">2010<\/td>\n<td class=\"org-left\">Gas metano per uso domestico e riscaldamento<\/td>\n<td class=\"org-right\">406.2<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<p> <a id=\"org19926be\"><\/a> The second column looks like a categorical series, so let&rsquo;s check it <\/p>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\">resources[<span style=\"color: #a6e3a1;\">\"Consumo pro capite tipo\"<\/span>].unique()\n<\/pre>\n<\/div>\n\n<em><\/em>\n<pre class=\"example\" id=\"nil\">\n['Energia elettrica per uso domestico'\n 'Gas metano per uso domestico e riscaldamento'\n 'Acqua fatturata per uso domestico']\n<\/pre>\n\n<p> <a id=\"org3d15c7d\"><\/a> we can pass a dictionary to the <code>.map()<\/code> method like this: <\/p>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cdd6f4;\">translate<\/span> <span style=\"color: #89dceb;\">=<\/span> {\n    <span style=\"color: #a6e3a1;\">'Energia elettrica per uso domestico'<\/span>:<span style=\"color: #a6e3a1;\">'electricity'<\/span>,\n    <span style=\"color: #a6e3a1;\">'Gas metano per uso domestico e riscaldamento'<\/span>:<span style=\"color: #a6e3a1;\">'methan'<\/span>,\n    <span style=\"color: #a6e3a1;\">'Acqua fatturata per uso domestico'<\/span>:<span style=\"color: #a6e3a1;\">'water'<\/span>\n}\n<span style=\"color: #cdd6f4;\">resources<\/span>[<span style=\"color: #a6e3a1;\">\"type\"<\/span>] <span style=\"color: #89dceb;\">=<\/span> resources[<span style=\"color: #a6e3a1;\">\"Consumo pro capite tipo\"<\/span>].<span style=\"color: #f38ba8;\">map<\/span>(translate)\n<\/pre>\n<\/div>\n\n<p> <a id=\"org3d65f7f\"><\/a> 
Also columns can be renamed or removed <\/p>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cdd6f4;\">resources<\/span> <span style=\"color: #89dceb;\">=<\/span> resources.rename({<span style=\"color: #a6e3a1;\">\"anno\"<\/span>:<span style=\"color: #a6e3a1;\">\"year\"<\/span>,<span style=\"color: #a6e3a1;\">\"Consumo pro capite\"<\/span>:<span style=\"color: #a6e3a1;\">\"usage per person\"<\/span>}, axis<span style=\"color: #89dceb;\">=<\/span><span style=\"color: #a6e3a1;\">\"columns\"<\/span>)\n<span style=\"color: #cba6f7;\">del<\/span> resources[<span style=\"color: #a6e3a1;\">\"Consumo pro capite tipo\"<\/span>]\nresources.head()\n<\/pre>\n<\/div>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-left\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-right\">&#xa0;<\/th>\n<th scope=\"col\" class=\"org-right\">year<\/th>\n<th scope=\"col\" class=\"org-right\">usage per person<\/th>\n<th scope=\"col\" class=\"org-left\">type<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-right\">2011<\/td>\n<td class=\"org-right\">1196.1<\/td>\n<td class=\"org-left\">electricity<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">2011<\/td>\n<td class=\"org-right\">377.9<\/td>\n<td class=\"org-left\">methan<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">2<\/td>\n<td class=\"org-right\">2011<\/td>\n<td class=\"org-right\">83.1<\/td>\n<td class=\"org-left\">water<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">3<\/td>\n<td class=\"org-right\">2010<\/td>\n<td class=\"org-right\">1200.7<\/td>\n<td class=\"org-left\">electricity<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">4<\/td>\n<td 
class=\"org-right\">2010<\/td>\n<td class=\"org-right\">406.2<\/td>\n<td class=\"org-left\">methan<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<p> <a id=\"org387d7e3\"><\/a> <\/p>\n<\/div>\n<\/div>\n<div id=\"outline-container-pivoting-and-melting\" class=\"outline-3\">\n<h3 id=\"pivoting-and-melting\">Pivoting and melting<\/h3>\n<div class=\"outline-text-3\" id=\"text-pivoting-and-melting\">\n<p> Pivot is a family of functions whose main purpose is to collect data from a relation and aggregate it using one or more attribute columns. <\/p>\n\n<p> This process creates one column for each combination of the attribute values; the resulting table is sometimes referred to as a &ldquo;wide format&rdquo; table or a &ldquo;two-way table&rdquo;. Let&rsquo;s look at an example: <\/p>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cdd6f4;\">resources2<\/span> <span style=\"color: #89dceb;\">=<\/span> resources.pivot(index<span style=\"color: #89dceb;\">=<\/span><span style=\"color: #a6e3a1;\">\"year\"<\/span>,columns<span style=\"color: #89dceb;\">=<\/span><span style=\"color: #a6e3a1;\">\"type\"<\/span>,values<span style=\"color: #89dceb;\">=<\/span><span style=\"color: #a6e3a1;\">\"usage per person\"<\/span>).reset_index()\nresources2\n<\/pre>\n<\/div>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-right\">&#xa0;<\/th>\n<th scope=\"col\" class=\"org-right\">year<\/th>\n<th scope=\"col\" class=\"org-right\">electricity<\/th>\n<th scope=\"col\" class=\"org-right\">methan<\/th>\n<th scope=\"col\" class=\"org-right\">water<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td 
class=\"org-right\">0<\/td>\n<td class=\"org-right\">2000.0<\/td>\n<td class=\"org-right\">1130.2<\/td>\n<td class=\"org-right\">509.0<\/td>\n<td class=\"org-right\">92.1<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-right\">2001.0<\/td>\n<td class=\"org-right\">1143.9<\/td>\n<td class=\"org-right\">500.7<\/td>\n<td class=\"org-right\">91.3<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">2<\/td>\n<td class=\"org-right\">2002.0<\/td>\n<td class=\"org-right\">1195.5<\/td>\n<td class=\"org-right\">504.2<\/td>\n<td class=\"org-right\">90.4<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">3<\/td>\n<td class=\"org-right\">2003.0<\/td>\n<td class=\"org-right\">1222.8<\/td>\n<td class=\"org-right\">480.2<\/td>\n<td class=\"org-right\">87.3<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">4<\/td>\n<td class=\"org-right\">2004.0<\/td>\n<td class=\"org-right\">1228.6<\/td>\n<td class=\"org-right\">442.4<\/td>\n<td class=\"org-right\">80.4<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">5<\/td>\n<td class=\"org-right\">2005.0<\/td>\n<td class=\"org-right\">1225.0<\/td>\n<td class=\"org-right\">434.5<\/td>\n<td class=\"org-right\">81.3<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">6<\/td>\n<td class=\"org-right\">2006.0<\/td>\n<td class=\"org-right\">1219.7<\/td>\n<td class=\"org-right\">431.3<\/td>\n<td class=\"org-right\">82.2<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">7<\/td>\n<td class=\"org-right\">2007.0<\/td>\n<td class=\"org-right\">1197.0<\/td>\n<td class=\"org-right\">381.1<\/td>\n<td class=\"org-right\">81.6<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">8<\/td>\n<td class=\"org-right\">2008.0<\/td>\n<td class=\"org-right\">1203.0<\/td>\n<td class=\"org-right\">384.9<\/td>\n<td class=\"org-right\">84.5<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">9<\/td>\n<td class=\"org-right\">2009.0<\/td>\n<td class=\"org-right\">1202.9<\/td>\n<td class=\"org-right\">389.6<\/td>\n<td class=\"org-right\">85.8<\/td>\n<\/tr>\n\n<tr>\n<td 
class=\"org-right\">10<\/td>\n<td class=\"org-right\">2010.0<\/td>\n<td class=\"org-right\">1200.7<\/td>\n<td class=\"org-right\">406.2<\/td>\n<td class=\"org-right\">83.2<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">11<\/td>\n<td class=\"org-right\">2011.0<\/td>\n<td class=\"org-right\">1196.1<\/td>\n<td class=\"org-right\">377.9<\/td>\n<td class=\"org-right\">83.1<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<p> <a id=\"org86b11cc\"><\/a> Since there was exactly one value for each year and each commodity, the previous example just rearranged the values without performing any calculation. <\/p>\n\n<p> Suppose now that we want to split some restaurant bills: <\/p>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cdd6f4;\">bill<\/span> <span style=\"color: #89dceb;\">=<\/span> pd.DataFrame([\n    {<span style=\"color: #a6e3a1;\">\"item\"<\/span>:<span style=\"color: #a6e3a1;\">\"pepperoni pizza\"<\/span>, <span style=\"color: #a6e3a1;\">\"amount\"<\/span>:<span style=\"color: #fab387;\">12<\/span>, <span style=\"color: #a6e3a1;\">\"customer\"<\/span>: <span style=\"color: #a6e3a1;\">\"Marco\"<\/span>, <span style=\"color: #a6e3a1;\">\"day\"<\/span>: <span style=\"color: #a6e3a1;\">\"Monday\"<\/span>},\n    {<span style=\"color: #a6e3a1;\">\"item\"<\/span>:<span style=\"color: #a6e3a1;\">\"beer\"<\/span>, <span style=\"color: #a6e3a1;\">\"amount\"<\/span>:<span style=\"color: #fab387;\">7.5<\/span>, <span style=\"color: #a6e3a1;\">\"customer\"<\/span>: <span style=\"color: #a6e3a1;\">\"Marco\"<\/span>, <span style=\"color: #a6e3a1;\">\"day\"<\/span>: <span style=\"color: #a6e3a1;\">\"Monday\"<\/span>},\n    {<span style=\"color: #a6e3a1;\">\"item\"<\/span>:<span style=\"color: #a6e3a1;\">\"coffee\"<\/span>, <span style=\"color: #a6e3a1;\">\"amount\"<\/span>:<span style=\"color: #fab387;\">1.2<\/span>, <span style=\"color: #a6e3a1;\">\"customer\"<\/span>: <span style=\"color: 
#a6e3a1;\">\"Marco\"<\/span>, <span style=\"color: #a6e3a1;\">\"day\"<\/span>: <span style=\"color: #a6e3a1;\">\"Monday\"<\/span>},\n    {<span style=\"color: #a6e3a1;\">\"item\"<\/span>:<span style=\"color: #a6e3a1;\">\"pizza margherita\"<\/span>, <span style=\"color: #a6e3a1;\">\"amount\"<\/span>:<span style=\"color: #fab387;\">10<\/span>, <span style=\"color: #a6e3a1;\">\"customer\"<\/span>: <span style=\"color: #a6e3a1;\">\"Luca\"<\/span>, <span style=\"color: #a6e3a1;\">\"day\"<\/span>: <span style=\"color: #a6e3a1;\">\"Monday\"<\/span>},\n    {<span style=\"color: #a6e3a1;\">\"item\"<\/span>:<span style=\"color: #a6e3a1;\">\"wine\"<\/span>, <span style=\"color: #a6e3a1;\">\"amount\"<\/span>:<span style=\"color: #fab387;\">10<\/span>, <span style=\"color: #a6e3a1;\">\"customer\"<\/span>: <span style=\"color: #a6e3a1;\">\"Luca\"<\/span>, <span style=\"color: #a6e3a1;\">\"day\"<\/span>: <span style=\"color: #a6e3a1;\">\"Monday\"<\/span>},\n    {<span style=\"color: #a6e3a1;\">\"item\"<\/span>:<span style=\"color: #a6e3a1;\">\"steak\"<\/span>, <span style=\"color: #a6e3a1;\">\"amount\"<\/span>:<span style=\"color: #fab387;\">20<\/span>, <span style=\"color: #a6e3a1;\">\"customer\"<\/span>: <span style=\"color: #a6e3a1;\">\"Marco\"<\/span>, <span style=\"color: #a6e3a1;\">\"day\"<\/span>: <span style=\"color: #a6e3a1;\">\"Tuesday\"<\/span>},\n    {<span style=\"color: #a6e3a1;\">\"item\"<\/span>:<span style=\"color: #a6e3a1;\">\"bottled water\"<\/span>, <span style=\"color: #a6e3a1;\">\"amount\"<\/span>:<span style=\"color: #fab387;\">5<\/span>, <span style=\"color: #a6e3a1;\">\"customer\"<\/span>: <span style=\"color: #a6e3a1;\">\"Marco\"<\/span>, <span style=\"color: #a6e3a1;\">\"day\"<\/span>: <span style=\"color: #a6e3a1;\">\"Tuesday\"<\/span>},\n])\nbill\n<\/pre>\n<\/div>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  
class=\"org-right\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-left\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-right\">&#xa0;<\/th>\n<th scope=\"col\" class=\"org-left\">item<\/th>\n<th scope=\"col\" class=\"org-right\">amount<\/th>\n<th scope=\"col\" class=\"org-left\">customer<\/th>\n<th scope=\"col\" class=\"org-left\">day<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-left\">pepperoni pizza<\/td>\n<td class=\"org-right\">12.0<\/td>\n<td class=\"org-left\">Marco<\/td>\n<td class=\"org-left\">Monday<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-left\">beer<\/td>\n<td class=\"org-right\">7.5<\/td>\n<td class=\"org-left\">Marco<\/td>\n<td class=\"org-left\">Monday<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">2<\/td>\n<td class=\"org-left\">coffee<\/td>\n<td class=\"org-right\">1.2<\/td>\n<td class=\"org-left\">Marco<\/td>\n<td class=\"org-left\">Monday<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">3<\/td>\n<td class=\"org-left\">pizza margherita<\/td>\n<td class=\"org-right\">10.0<\/td>\n<td class=\"org-left\">Luca<\/td>\n<td class=\"org-left\">Monday<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">4<\/td>\n<td class=\"org-left\">wine<\/td>\n<td class=\"org-right\">10.0<\/td>\n<td class=\"org-left\">Luca<\/td>\n<td class=\"org-left\">Monday<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">5<\/td>\n<td class=\"org-left\">steak<\/td>\n<td class=\"org-right\">20.0<\/td>\n<td class=\"org-left\">Marco<\/td>\n<td class=\"org-left\">Tuesday<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">6<\/td>\n<td class=\"org-left\">bottled water<\/td>\n<td class=\"org-right\">5.0<\/td>\n<td class=\"org-left\">Marco<\/td>\n<td class=\"org-left\">Tuesday<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<p> <a id=\"org1775039\"><\/a> The pandas function <code>pivot_table<\/code> lets us define an aggregation function to apply when several rows fall into the same cell: <\/p>\n\n<div class=\"org-src-container\">\n<label 
class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\"><span style=\"color: #cdd6f4;\">splitted_bill<\/span> <span style=\"color: #89dceb;\">=<\/span> pd.pivot_table(bill,index<span style=\"color: #89dceb;\">=<\/span><span style=\"color: #a6e3a1;\">\"day\"<\/span>,values<span style=\"color: #89dceb;\">=<\/span><span style=\"color: #a6e3a1;\">\"amount\"<\/span>,columns<span style=\"color: #89dceb;\">=<\/span><span style=\"color: #a6e3a1;\">\"customer\"<\/span>,aggfunc<span style=\"color: #89dceb;\">=<\/span><span style=\"color: #a6e3a1;\">\"sum\"<\/span>)\nsplitted_bill\n<\/pre>\n<\/div>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-right\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-left\">day<\/th>\n<th scope=\"col\" class=\"org-right\">Luca<\/th>\n<th scope=\"col\" class=\"org-right\">Marco<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-left\">Monday<\/td>\n<td class=\"org-right\">20.0<\/td>\n<td class=\"org-right\">20.7<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-left\">Tuesday<\/td>\n<td class=\"org-right\">nan<\/td>\n<td class=\"org-right\">25.0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n\n<p> <a id=\"org65e055a\"><\/a> The pandas function <code>pd.melt()<\/code> goes the other way, turning a wide table into a &ldquo;long format&rdquo; table: <\/p>\n\n<div class=\"org-src-container\">\n<label class=\"org-src-name\"><em><\/em><\/label>\n<pre class=\"src src-python\" id=\"nil\">pd.melt(splitted_bill)\n<\/pre>\n<\/div>\n\n<table border=\"2\" cellspacing=\"0\" cellpadding=\"6\" rules=\"groups\" frame=\"hsides\">\n\n\n<colgroup>\n<col  class=\"org-right\" \/>\n\n<col  class=\"org-left\" \/>\n\n<col  class=\"org-right\" \/>\n<\/colgroup>\n<thead>\n<tr>\n<th scope=\"col\" class=\"org-right\">&#xa0;<\/th>\n<th scope=\"col\" class=\"org-left\">customer<\/th>\n<th scope=\"col\" 
class=\"org-right\">value<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"org-right\">0<\/td>\n<td class=\"org-left\">Luca<\/td>\n<td class=\"org-right\">20.0<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">1<\/td>\n<td class=\"org-left\">Luca<\/td>\n<td class=\"org-right\">nan<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">2<\/td>\n<td class=\"org-left\">Marco<\/td>\n<td class=\"org-right\">20.7<\/td>\n<\/tr>\n\n<tr>\n<td class=\"org-right\">3<\/td>\n<td class=\"org-left\">Marco<\/td>\n<td class=\"org-right\">25.0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"We started our space trip to the galaxy of Python Analytics heading onto Jupyter .\nNow it\u2019s time to meet some of the most fascinating inhabitants: the pandas","protected":false},"author":1,"featured_media":593,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"nf_dc_page":"","inline_featured_image":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[4],"tags":[7],"class_list":["post-594","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-language-learning","tag-python"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Meet the Pandas - Noise On The Net<\/title>\n<meta name=\"robots\" 
content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/noiseonthenet.space\/noise\/2025\/01\/meet-the-pandas\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Meet the Pandas - Noise On The Net\" \/>\n<meta property=\"og:description\" content=\"We started our space trip to the galaxy of Python Analytics heading onto Jupyter . Now it\u2019s time to meet some of the most fascinating inhabitants: the pandas\" \/>\n<meta property=\"og:url\" content=\"https:\/\/noiseonthenet.space\/noise\/2025\/01\/meet-the-pandas\/\" \/>\n<meta property=\"og:site_name\" content=\"Noise On The Net\" \/>\n<meta property=\"article:published_time\" content=\"2025-01-25T15:08:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-01-26T19:27:05+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/noiseonthenet.space\/noise\/wp-content\/uploads\/2025\/01\/thomas-bonometti-OyO5NDiRPMM-unsplash.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"800\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"marco.p.v.vezzoli\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"marco.p.v.vezzoli\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/2025\\\/01\\\/meet-the-pandas\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/2025\\\/01\\\/meet-the-pandas\\\/\"},\"author\":{\"name\":\"marco.p.v.vezzoli\",\"@id\":\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/#\\\/schema\\\/person\\\/88c3a70f2b23480197bc61d6e1e2e982\"},\"headline\":\"Meet the Pandas\",\"datePublished\":\"2025-01-25T15:08:00+00:00\",\"dateModified\":\"2025-01-26T19:27:05+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/2025\\\/01\\\/meet-the-pandas\\\/\"},\"wordCount\":1831,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/#\\\/schema\\\/person\\\/88c3a70f2b23480197bc61d6e1e2e982\"},\"image\":{\"@id\":\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/2025\\\/01\\\/meet-the-pandas\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/i0.wp.com\\\/noiseonthenet.space\\\/noise\\\/wp-content\\\/uploads\\\/2025\\\/01\\\/thomas-bonometti-OyO5NDiRPMM-unsplash.jpg?fit=1200%2C800&ssl=1\",\"keywords\":[\"Python\"],\"articleSection\":[\"Language learning\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/2025\\\/01\\\/meet-the-pandas\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/2025\\\/01\\\/meet-the-pandas\\\/\",\"url\":\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/2025\\\/01\\\/meet-the-pandas\\\/\",\"name\":\"Meet the Pandas - Noise On The 
Net\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/2025\\\/01\\\/meet-the-pandas\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/2025\\\/01\\\/meet-the-pandas\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/i0.wp.com\\\/noiseonthenet.space\\\/noise\\\/wp-content\\\/uploads\\\/2025\\\/01\\\/thomas-bonometti-OyO5NDiRPMM-unsplash.jpg?fit=1200%2C800&ssl=1\",\"datePublished\":\"2025-01-25T15:08:00+00:00\",\"dateModified\":\"2025-01-26T19:27:05+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/2025\\\/01\\\/meet-the-pandas\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/2025\\\/01\\\/meet-the-pandas\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/2025\\\/01\\\/meet-the-pandas\\\/#primaryimage\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/noiseonthenet.space\\\/noise\\\/wp-content\\\/uploads\\\/2025\\\/01\\\/thomas-bonometti-OyO5NDiRPMM-unsplash.jpg?fit=1200%2C800&ssl=1\",\"contentUrl\":\"https:\\\/\\\/i0.wp.com\\\/noiseonthenet.space\\\/noise\\\/wp-content\\\/uploads\\\/2025\\\/01\\\/thomas-bonometti-OyO5NDiRPMM-unsplash.jpg?fit=1200%2C800&ssl=1\",\"width\":1200,\"height\":800},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/2025\\\/01\\\/meet-the-pandas\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Meet the Pandas\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/#website\",\"url\":\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/\",\"name\":\"Noise On The Net\",\"description\":\"Sharing 
adventures in code\",\"publisher\":{\"@id\":\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/#\\\/schema\\\/person\\\/88c3a70f2b23480197bc61d6e1e2e982\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/#\\\/schema\\\/person\\\/88c3a70f2b23480197bc61d6e1e2e982\",\"name\":\"marco.p.v.vezzoli\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/b9d9aab1df560bc14d73b0b442198f196ce39e7c7a38df1dc22fec0b97f17da9?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/b9d9aab1df560bc14d73b0b442198f196ce39e7c7a38df1dc22fec0b97f17da9?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/b9d9aab1df560bc14d73b0b442198f196ce39e7c7a38df1dc22fec0b97f17da9?s=96&d=mm&r=g\",\"caption\":\"marco.p.v.vezzoli\"},\"logo\":{\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/b9d9aab1df560bc14d73b0b442198f196ce39e7c7a38df1dc22fec0b97f17da9?s=96&d=mm&r=g\"},\"description\":\"Self taught assembler programming at 11 on my C64 (1983). Never stopped since then -- always looking up for curious things in the software development, data science and AI. Linux and FOSS user since 1994. MSc in physics in 1996. 
Working in large semiconductor companies since 1997 (STM, Micron) developing analytics and full stack web infrastructures, microservices, ML solutions\",\"sameAs\":[\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/\",\"https:\\\/\\\/www.linkedin.com\\\/in\\\/marco-paolo-valerio-vezzoli-0663835\\\/\"],\"url\":\"https:\\\/\\\/noiseonthenet.space\\\/noise\\\/author\\\/marco-p-v-vezzoli\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Meet the Pandas - Noise On The Net","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/noiseonthenet.space\/noise\/2025\/01\/meet-the-pandas\/","og_locale":"en_US","og_type":"article","og_title":"Meet the Pandas - Noise On The Net","og_description":"We started our space trip to the galaxy of Python Analytics heading onto Jupyter . Now it\u2019s time to meet some of the most fascinating inhabitants: the pandas","og_url":"https:\/\/noiseonthenet.space\/noise\/2025\/01\/meet-the-pandas\/","og_site_name":"Noise On The Net","article_published_time":"2025-01-25T15:08:00+00:00","article_modified_time":"2025-01-26T19:27:05+00:00","og_image":[{"width":1200,"height":800,"url":"https:\/\/noiseonthenet.space\/noise\/wp-content\/uploads\/2025\/01\/thomas-bonometti-OyO5NDiRPMM-unsplash.jpg","type":"image\/jpeg"}],"author":"marco.p.v.vezzoli","twitter_card":"summary_large_image","twitter_misc":{"Written by":"marco.p.v.vezzoli","Est. 
reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/noiseonthenet.space\/noise\/2025\/01\/meet-the-pandas\/#article","isPartOf":{"@id":"https:\/\/noiseonthenet.space\/noise\/2025\/01\/meet-the-pandas\/"},"author":{"name":"marco.p.v.vezzoli","@id":"https:\/\/noiseonthenet.space\/noise\/#\/schema\/person\/88c3a70f2b23480197bc61d6e1e2e982"},"headline":"Meet the Pandas","datePublished":"2025-01-25T15:08:00+00:00","dateModified":"2025-01-26T19:27:05+00:00","mainEntityOfPage":{"@id":"https:\/\/noiseonthenet.space\/noise\/2025\/01\/meet-the-pandas\/"},"wordCount":1831,"commentCount":0,"publisher":{"@id":"https:\/\/noiseonthenet.space\/noise\/#\/schema\/person\/88c3a70f2b23480197bc61d6e1e2e982"},"image":{"@id":"https:\/\/noiseonthenet.space\/noise\/2025\/01\/meet-the-pandas\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/noiseonthenet.space\/noise\/wp-content\/uploads\/2025\/01\/thomas-bonometti-OyO5NDiRPMM-unsplash.jpg?fit=1200%2C800&ssl=1","keywords":["Python"],"articleSection":["Language learning"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/noiseonthenet.space\/noise\/2025\/01\/meet-the-pandas\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/noiseonthenet.space\/noise\/2025\/01\/meet-the-pandas\/","url":"https:\/\/noiseonthenet.space\/noise\/2025\/01\/meet-the-pandas\/","name":"Meet the Pandas - Noise On The 
Net","isPartOf":{"@id":"https:\/\/noiseonthenet.space\/noise\/#website"},"primaryImageOfPage":{"@id":"https:\/\/noiseonthenet.space\/noise\/2025\/01\/meet-the-pandas\/#primaryimage"},"image":{"@id":"https:\/\/noiseonthenet.space\/noise\/2025\/01\/meet-the-pandas\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/noiseonthenet.space\/noise\/wp-content\/uploads\/2025\/01\/thomas-bonometti-OyO5NDiRPMM-unsplash.jpg?fit=1200%2C800&ssl=1","datePublished":"2025-01-25T15:08:00+00:00","dateModified":"2025-01-26T19:27:05+00:00","breadcrumb":{"@id":"https:\/\/noiseonthenet.space\/noise\/2025\/01\/meet-the-pandas\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/noiseonthenet.space\/noise\/2025\/01\/meet-the-pandas\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/noiseonthenet.space\/noise\/2025\/01\/meet-the-pandas\/#primaryimage","url":"https:\/\/i0.wp.com\/noiseonthenet.space\/noise\/wp-content\/uploads\/2025\/01\/thomas-bonometti-OyO5NDiRPMM-unsplash.jpg?fit=1200%2C800&ssl=1","contentUrl":"https:\/\/i0.wp.com\/noiseonthenet.space\/noise\/wp-content\/uploads\/2025\/01\/thomas-bonometti-OyO5NDiRPMM-unsplash.jpg?fit=1200%2C800&ssl=1","width":1200,"height":800},{"@type":"BreadcrumbList","@id":"https:\/\/noiseonthenet.space\/noise\/2025\/01\/meet-the-pandas\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/noiseonthenet.space\/noise\/"},{"@type":"ListItem","position":2,"name":"Meet the Pandas"}]},{"@type":"WebSite","@id":"https:\/\/noiseonthenet.space\/noise\/#website","url":"https:\/\/noiseonthenet.space\/noise\/","name":"Noise On The Net","description":"Sharing adventures in 
code","publisher":{"@id":"https:\/\/noiseonthenet.space\/noise\/#\/schema\/person\/88c3a70f2b23480197bc61d6e1e2e982"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/noiseonthenet.space\/noise\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/noiseonthenet.space\/noise\/#\/schema\/person\/88c3a70f2b23480197bc61d6e1e2e982","name":"marco.p.v.vezzoli","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/b9d9aab1df560bc14d73b0b442198f196ce39e7c7a38df1dc22fec0b97f17da9?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/b9d9aab1df560bc14d73b0b442198f196ce39e7c7a38df1dc22fec0b97f17da9?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/b9d9aab1df560bc14d73b0b442198f196ce39e7c7a38df1dc22fec0b97f17da9?s=96&d=mm&r=g","caption":"marco.p.v.vezzoli"},"logo":{"@id":"https:\/\/secure.gravatar.com\/avatar\/b9d9aab1df560bc14d73b0b442198f196ce39e7c7a38df1dc22fec0b97f17da9?s=96&d=mm&r=g"},"description":"Self taught assembler programming at 11 on my C64 (1983). Never stopped since then -- always looking up for curious things in the software development, data science and AI. Linux and FOSS user since 1994. MSc in physics in 1996. 
Working in large semiconductor companies since 1997 (STM, Micron) developing analytics and full stack web infrastructures, microservices, ML solutions","sameAs":["https:\/\/noiseonthenet.space\/noise\/","https:\/\/www.linkedin.com\/in\/marco-paolo-valerio-vezzoli-0663835\/"],"url":"https:\/\/noiseonthenet.space\/noise\/author\/marco-p-v-vezzoli\/"}]}},"jetpack_featured_media_url":"https:\/\/i0.wp.com\/noiseonthenet.space\/noise\/wp-content\/uploads\/2025\/01\/thomas-bonometti-OyO5NDiRPMM-unsplash.jpg?fit=1200%2C800&ssl=1","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/pdDUZ5-9A","jetpack-related-posts":[],"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/noiseonthenet.space\/noise\/wp-json\/wp\/v2\/posts\/594","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/noiseonthenet.space\/noise\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/noiseonthenet.space\/noise\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/noiseonthenet.space\/noise\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/noiseonthenet.space\/noise\/wp-json\/wp\/v2\/comments?post=594"}],"version-history":[{"count":5,"href":"https:\/\/noiseonthenet.space\/noise\/wp-json\/wp\/v2\/posts\/594\/revisions"}],"predecessor-version":[{"id":625,"href":"https:\/\/noiseonthenet.space\/noise\/wp-json\/wp\/v2\/posts\/594\/revisions\/625"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/noiseonthenet.space\/noise\/wp-json\/wp\/v2\/media\/593"}],"wp:attachment":[{"href":"https:\/\/noiseonthenet.space\/noise\/wp-json\/wp\/v2\/media?parent=594"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/noiseonthenet.space\/noise\/wp-json\/wp\/v2\/categories?post=594"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/noiseonthenet.space\/noise\/wp-json\/wp\/v2\/tags?post=594"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}