I've got a few suggestions based on what I've learned over the past 15 years or so. However, I'd say wait and see what I suggest; there are some things you might want to reject.
bobby s wrote:
1) Have we decided on a default format?
2) Are we starting with the period 1890-1939 first?
If I can see a default format, I'll amend my spreadsheet to suit.
The data is split into the following sections:
Clubs
Related to each team:
the 'manager list' with start and end dates
the 'venue list' with start and end dates, city, country
Games
Uniquely identified by
the 'date' in format CCYYMMDD,
the 'home' or 'away' team
a sequence number to cater for 2 or more games played on a single day
then extra details about the competition, round etc., though this doesn't have to be defined in the details of the 'Team Players per match'
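As a rough sketch of how that unique key could be built in Python (the names `GameKey` and `make_key` are made up for illustration, and I'm assuming the 'team' part is the name of the team whose file the game sits in):

```python
from datetime import date
from typing import NamedTuple


class GameKey(NamedTuple):
    """Hypothetical key: date (CCYYMMDD), team, sequence number."""
    date: str
    team: str
    seq: int


def make_key(year: int, month: int, day: int, team: str, seq: int = 1) -> GameKey:
    """Build a key; seq separates 2 or more games played on a single day."""
    if seq < 1:
        raise ValueError("sequence number starts at 1")
    date(year, month, day)  # raises ValueError for an impossible date
    return GameKey(f"{year:04d}{month:02d}{day:02d}", team, seq)


first = make_key(1890, 8, 16, "Rangers")
second = make_key(1890, 8, 16, "Rangers", seq=2)
print(first)
print(first != second)
```

Keeping the date as a CCYYMMDD string means the keys sort into date order as plain text, which is handy in a spreadsheet.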
Each 'game' would produce 2 entries in the 'Team Players per match' file
1 for the 'home' team file
1 for the 'away' team file
This is for Rangers' and Hearts' first league game
From a Rangers Perspective
In their file
1890 08 16 Heart Of Midlothian L1 H 5 2 Half Time 1-2 etc .......
In the Hearts file it appears 'reversed'
1890 08 16 Rangers L1 A 2 5 Half Time 2-1 etc .......
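The 'reversed' entry can be derived from the other one rather than typed twice. A minimal Python sketch, assuming a made-up `TeamGame` record holding only the fields visible in the two lines above (date, opponent, competition, H/A, score, half-time score):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TeamGame:
    """One line in a team's file (hypothetical layout)."""
    date: str          # CCYYMMDD
    opponent: str
    competition: str   # e.g. "L1"
    venue: str         # "H" or "A"
    goals_for: int
    goals_against: int
    ht_for: int
    ht_against: int


def mirror(row: TeamGame, file_owner: str) -> TeamGame:
    """Return the same game as it would appear in the opponent's file."""
    swap = {"H": "A", "A": "H"}
    return TeamGame(
        date=row.date,
        opponent=file_owner,
        competition=row.competition,
        venue=swap.get(row.venue, row.venue),  # anything else is left alone
        goals_for=row.goals_against,
        goals_against=row.goals_for,
        ht_for=row.ht_against,
        ht_against=row.ht_for,
    )


rangers_row = TeamGame("18900816", "Heart Of Midlothian", "L1", "H", 5, 2, 1, 2)
hearts_row = mirror(rangers_row, "Rangers")
print(hearts_row)
```

Mirroring twice should give back the original line, which makes a cheap consistency check between the two team files.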
Master Players List
Each player should ultimately be uniquely identified. John Lister's disc would be a great starting point if we can get permission to use it.
Master Players per team per season List
Each unique player assigned to a team and the seasons (and/or start and end dates) they were there, with the various 'shortened' versions of their names for matching to the 'Team Players per match'
so 'Joseph Henry Baker' '1957-58' 'Hibernian' can be matched against an entry of 'Baker' playing for Hibernian in 1957-58. Though for later seasons it might have to be 'Baker J' to distinguish from 'Gerry Baker'
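A rough Python sketch of that matching step. Everything here is an assumption for illustration: the `roster` layout, the two shortened forms ('Surname' and 'Surname Initial'), and the season label "LATER", which is only a placeholder for a season where both Bakers are at the club.

```python
from typing import Optional


def short_names(full_name: str) -> list[str]:
    """Shortened forms of a name: 'Baker' and 'Baker J'."""
    parts = full_name.split()
    surname, first = parts[-1], parts[0]
    return [surname, f"{surname} {first[0]}"]


def match(recorded: str, club: str, season: str,
          roster: list[tuple[str, str, str]]) -> Optional[str]:
    """Return the unique player name, or None if there are 0 or 2+ candidates."""
    hits = [full for full, c, s in roster
            if c == club and s == season and recorded in short_names(full)]
    return hits[0] if len(hits) == 1 else None


roster = [
    ("Joseph Henry Baker", "Hibernian", "1957-58"),
    ("Joseph Henry Baker", "Hibernian", "LATER"),  # placeholder season
    ("Gerry Baker", "Hibernian", "LATER"),         # placeholder season
]

print(match("Baker", "Hibernian", "1957-58", roster))
print(match("Baker", "Hibernian", "LATER", roster))
print(match("Baker J", "Hibernian", "LATER", roster))
```

Returning nothing for an ambiguous 'Baker' leaves those entries to be checked by hand instead of guessing. Multi-word surnames would need extra handling.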
Team Players per match
For each match, the players who played. Originally I just used a single 'cell' per player, with substitutes' names and times in brackets.
However, I now record the items shown in bold. The additional columns are things we might like to add.
Even though substitutes weren't generally allowed until the mid 1960s, they did appear in the odd game prior to that. Using the same format for all the years does make it easier to load in.
Name of player 1 <-- as recorded in the source
Unique name of player 1 <-- the unique name for that player. I can add this in automatically in most cases once we have a file of 'unique player names against team and season'
Name of Club of player 1 <-- Possibly Overkill but it caters for 'select teams' , guest players from other clubs etc
Name of substitute for player 1
Unique Name of substitute for player 1
Name of Club of substitute for player 1
Time of substitute for player 1 if known
Name of player 2
Unique name of player 2
Name of Club of player 2
Name of substitute for player 2
Unique Name of substitute for player 2
Name of Club of substitute for player 2
Time of substitute for player 2
Name of player 3
Unique name of player 3
Name of Club of player 3
Name of substitute for player 3
Unique Name of substitute for player 3
Name of Club of substitute for player 3
Time of substitute for player 3
etc
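Since the same seven columns repeat for every player, the header row can be generated rather than typed out. A small Python sketch; the function name `lineup_columns` and the default of 11 players are assumptions for illustration:

```python
# The seven per-player columns, in the order listed above.
FIELDS = [
    "Name of player {n}",
    "Unique name of player {n}",
    "Name of Club of player {n}",
    "Name of substitute for player {n}",
    "Unique Name of substitute for player {n}",
    "Name of Club of substitute for player {n}",
    "Time of substitute for player {n}",
]


def lineup_columns(players: int = 11) -> list[str]:
    """Return the column headers for the given number of players."""
    return [field.format(n=n)
            for n in range(1, players + 1)
            for field in FIELDS]


columns = lineup_columns()
print(len(columns))
print(columns[:7])
```

With 11 players that gives 77 columns per team per match, the same for every season, which is what keeps the loading simple.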
Where a player is known to have come on as a sub, but it is not known who for, I record them as substitute 1.
Players not used are recorded as unused subs.
I'll post up some examples of the different files and how we can use each file to enhance the data in another set of data.