Cricket query help

Back to cricket query

Basics

This is an interface to an SQLite database. The database contains a scrape of Cricinfo's Statsguru batting, bowling and team tabs. It allows running (read-only) SQL queries directly, and has some saved queries:

The scraped data comes from Owen Brasier's cricketstats project. The source for this project is also public.

Other sections on this page are schema, annoyances, result formatting, and latest data.

Schema

This is a pretty direct interface to an SQLite database. The database contains a scrape of Cricinfo's Statsguru batting, bowling and team tabs in the following tables, where $gender can be men or women and $format can be test, odi, or t20i.

Table aliases

As the table names are very long, and for many purposes we will want to separate the results by gender and format anyway (comparing a T20I average to a Test average is not often useful), these tables have shorter names available when choosing genders and formats with the checkboxes:

This will allow a separate results table for each gender and format chosen. If we use the full table names (for instance, women_test_bowling_innings) directly, then the results shown for all genders and formats will be the same.

Batting tables

Column name Type Comment
i integer Internal ID, not particularly useful
player text
team text
runs integer
runs_txt text Score, suffixed with * if not out
not_out boolean See boolean columns
mins numeric
bf numeric
fours numeric
sixes numeric
sr numeric
pos integer
innings integer
opposition text
ground text
start_date date See date columns
player_id text See ID columns
match_id text See ID columns

Bowling tables

Column name Type Comment
i integer Internal ID, not particularly useful
player text
team text
overs text May contain a dot; for instance, 4.1
maidens numeric
runs numeric
wickets numeric
bpo integer
balls numeric
economy numeric
pos integer
innings integer
opposition text
ground text
start_date date See date columns
player_id text See ID columns
match_id text See ID columns

Team tables

Column name Type Comment
i integer Internal ID, not particularly useful
team text
score text
runs integer
overs numeric
bpo integer
rpo numeric
lead integer
all_out boolean See boolean columns
declared boolean See boolean columns
result text
innings integer
opposition text
ground text
start_date date See date columns
match_id text See ID columns

Annoyances

There are several limitations in this database, largely due to a loss of fidelity when scraping and the choice to use a loosely-typed database. These are the most annoying of those.

Boolean columns

Boolean columns (like innings.not_out and team_innings.all_out) are not real booleans, as SQLite doesn't support them. Instead, they contain the strings 'True' and 'False', so compare directly against those values.

Date columns

Similarly, the start_date columns are not real dates. For prettier formatting, they are displayed in a format like this: 2 June 2021.

However, internally they are represented as ISO date strings: 2021-06-02. To find matches on that date, we can use start_date = '2021-06-02'.

Results limit

Hopefully this is less annoying, but I have chosen to limit the number of rows in each table to 100, no matter how many rows were returned. If this becomes a problem I can increase it.

Result formatting

ID columns

Player and match ID columns have special behaviour. A player ID is p followed by digits (with no other characters in the string), and a match ID is the same but with an m prefix.

When a player or match ID is detected (either from the original value, or constructed if you really want to), it will be linked to the relevant Cricinfo player profile or scorecard page.

Other result formatting

In addition to the automatic formatting for date columns and ID columns, numeric columns will be formatted with a thousands separator, and if they contain a non-integer, will be shown to two decimal places.

To force a result column to be displayed literally, prepend an apostrophe - ' - this will be stripped from the output and the rest will be presented verbatim. For example: strftime('%Y', start_date) will display as 2,001, while "'" || strftime('%Y', start_date) will display as 2001.

Latest data

The database is updated daily, although there will be a delay when new teams play their first international match. The latest match for each format is:

gender format team opposition ground start_date match_id
men test South Africa Australia Sydney 4 January 2023 m1317494
women test South Africa England Taunton 27 June 2022 m1301327
men odi South Africa England Bloemfontein 29 January 2023 m1339596
women odi Pakistan Australia Sydney 21 January 2023 m1345094
men t20i India New Zealand Lucknow 29 January 2023 m1348650
women t20i India South Africa East London 28 January 2023 m1344515

By Sean McGivern