Jfjelstul Worldcup Data-csv Appearances May 2026

For the analyst, this file is a playground of temporal logic. For the fan, it is a reminder that every minute on that pitch is a dataset of one. Load the CSV. Run the join. Ask who really worked the hardest. The answer is waiting in the rows of appearances.csv.

Calculate the average minute of the first substitution per decade.

This is the story of the appearances.csv file, a relational goldmine that turns abstract match results into tangible human performance. Before we dive into queries, we must understand the granularity. In the jfjelstul/worldcup model, appearances.csv is a fact table linking players to matches: each row records one player's involvement in one match. It contains 4,000+ rows (depending on the latest update), covering every World Cup from 1930 to 2022.
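One way to confirm that grain before writing any query is to check that the pair (match_id, player_id) never repeats. The sketch below uses a small hypothetical table in place of the real download; the column names are taken from the snippets in this article and are assumptions to verify against the file's header.

```python
import pandas as pd

# Hypothetical miniature of the appearances table; the real file's column
# names may differ and should be checked against its header.
appearances = pd.DataFrame({
    "match_id": ["M-01", "M-01", "M-01", "M-02"],
    "player_id": ["P-01", "P-02", "P-03", "P-01"],
    "game_started": [True, True, False, True],
})

# A fact table at player-match grain has one row per (match_id, player_id),
# so no key pair should appear twice.
key = ["match_id", "player_id"]
duplicate_rows = int(appearances.duplicated(subset=key).sum())

# Count appearances per player: P-01 shows up in both matches.
appearances_per_player = appearances.groupby("player_id").size().to_dict()

print(duplicate_rows)
print(appearances_per_player)
```

If the duplicate count is not zero on the real data, joins on these two columns will multiply rows and inflate any totals built on top of them.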

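Using the appearances table, you must calculate time_played = (substitute_out - substitute_in) for each row. For players who played the full 90 (or 120), the logic is different: with no substitution recorded, the subtraction has nothing to work with, so a starter's entry minute has to default to 0 and an unreplaced player's exit minute to the length of the match.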
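A minimal sketch of that calculation, under stated assumptions: substitute_in and substitute_out hold a minute or are missing, and match_length is a hypothetical helper column (90, or 120 after extra time) that the file may not provide directly. Stoppage time is ignored.

```python
import pandas as pd

nan = float("nan")

# Hypothetical rows; match_length is an assumed helper column, not one the
# article confirms exists in the file.
appearances = pd.DataFrame({
    "player_id": ["P-01", "P-02", "P-03", "P-04"],
    "substitute_in": [nan, nan, 60, 46],    # missing = started the match
    "substitute_out": [nan, 60, nan, 80],   # missing = stayed on to the end
    "match_length": [90, 90, 90, 120],
})

# Starters enter at minute 0; players never replaced leave at full time.
entered = appearances["substitute_in"].fillna(0)
left = appearances["substitute_out"].fillna(appearances["match_length"])

appearances["time_played"] = (left - entered).astype(int)
print(appearances[["player_id", "time_played"]])
```

P-01 plays the full 90, P-02 starts and leaves at 60, P-03 comes on at 60 and finishes, and P-04 comes on at 46 and is himself replaced at 80.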

    import pandas as pd

    appearances = pd.read_csv('https://raw.githubusercontent.com/jfjelstul/worldcup/master/data-csv/appearances.csv')
    goals = pd.read_csv('https://raw.githubusercontent.com/jfjelstul/worldcup/master/data-csv/goals.csv')

    # Filter for substitutes (game_started = FALSE)
    subs = appearances[appearances['game_started'] == False]

    # Merge with goals to count goals by sub appearances
    sub_goals = goals.merge(subs, on=['match_id', 'player_id'])
    sub_goals_count = sub_goals.groupby('player_name_x').size().reset_index(name='goals')
    sub_goals_count.sort_values('goals', ascending=False).head(10)

    SELECT player_name, team, SUM(minutes_played) AS total_minutes
    FROM appearances
    WHERE tournament = '2022'
    GROUP BY player_id, player_name, team
    ORDER BY total_minutes DESC;

Goalkeepers and center-backs from finalists dominate. In 2022, Emiliano Martínez (Argentina) would sit at or near the top of the list with roughly 690 minutes: five 90-minute matches plus two that went to extra time. But the real magic is historical: in 2014, Manuel Neuer played every single minute of Germany’s run, including the final.

3. The Tactical Insight: Substitution Dynamics Over Time

The substitute_in and substitute_out columns allow you to map the evolution of tactics. Before 1970, substitutions were not permitted at the World Cup. By 2022, five substitutions were allowed.
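The per-decade question posed earlier (the average minute of the first substitution) can be sketched along these lines. The rows are hypothetical, and the year and substitute_in column names are assumptions; on the real data the tournament year would likely need to be joined in from another table.

```python
import pandas as pd

# Hypothetical rows: one per substitute appearance, with the tournament year
# and the minute the player came on. Values are invented for illustration.
subs = pd.DataFrame({
    "year": [1970, 1970, 1970, 2022, 2022, 2022],
    "match_id": ["A", "A", "B", "C", "C", "D"],
    "substitute_in": [70, 85, 60, 46, 75, 58],
})

# Earliest substitution in each match.
first_sub = subs.groupby(["year", "match_id"], as_index=False)["substitute_in"].min()

# Bucket tournaments into decades, then average the first-substitution minute.
first_sub["decade"] = (first_sub["year"] // 10) * 10
avg_first_sub = first_sub.groupby("decade")["substitute_in"].mean().to_dict()

print(avg_first_sub)
```

This version takes the first substitution per match; grouping by team as well would give the first change each side made, which may be the more useful tactical measure.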
