eBird is a catalog of 'checklists' of bird species observed worldwide by novice and experienced birders who report to the eBird Citizen Science Project, run by the Cornell Lab of Ornithology and the National Audubon Society. Each 'checklist' includes presence/absence, species count, and location information for a single observation event. Four datasets, covering various groupings and subsets of the expansive catalog, are available online:

- The Observation Dataset is a catalog of species counts (>150,000,000) with date and location. The dataset is updated annually.

- The Reference Data includes two independent catalogs of 'checklists', one for the Western Hemisphere and one for the lower 48 United States. Included with the 'checklist' species data are social and environmental correlates or 'predictor variables'.

- The Stratified Random Design Data is a companion dataset to the Reference Data. It includes three independent subsets of lat/long coordinates and 'predictor variables' at various spatial extents and resolutions. The lat/long coordinates are chosen to give a representative and constant spatial extent useful for statistical analysis. This data is intended for use with ecological models to generate maps and predicted surfaces. These data tables contain no species count data.

- The eBird Basic Dataset is the complete catalog of 'checklists' worldwide. It is located behind a log-in screen and is extremely large (i.e., 4 GB text file with >100,000,000 records).

Dataset Location: DataONE: eBird Observation Dataset; eBird.org: eBird Reference and Stratified Random Design Datasets, and eBird Basic Dataset (behind log-in screen)

Metadata: Paper describing the eBird Reference Dataset

Site Location: eBird logs recordings of bird observations worldwide, as cataloged in the Basic Dataset and the Observation Dataset. The Reference and Stratified Random Design datasets include environmental and social covariates for two spatial extents: the Western Hemisphere and the lower 48 United States. The three components of the Stratified Random Design Dataset differ in spatial extent and resolution, so their file sizes vary drastically: lower 48 US at 30km and 3km; Western Hemisphere at 1.5km.
Site(s) Georeferenced: Yes

Timespan: 2002 to 2013

Sampling Frequency: Irregular but frequent

Data collection summary/Methods: Data is collected by volunteers through a reporting 'checklist'. Species observations (visual or aural) are recorded with an estimated count. Sampling event details and checklist type are also reported. A 'casual count' checklist contains observations made while birding was not the submitter’s primary activity. A 'random count' checklist contains observations made at a randomly selected location over a period of at least five minutes. A checklist is marked 'complete' if the observer reported all the species seen and heard to the best of his or her ability. 'Complete' does not require that all species in the area were observed or accounted for. eBird documentation suggests that a 'complete' checklist allows analysis of presence and absence of species.
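The presence/absence reasoning for 'complete' checklists can be sketched as follows. This is a minimal illustration, not eBird's own method: the dictionary layout and the `complete` and `species` keys are hypothetical and do not correspond to actual eBird field names. A species missing from a 'complete' checklist is treated as not detected, while a species missing from an incomplete checklist is left undetermined.

```python
from typing import Optional


def detection(checklist: dict, species: str) -> Optional[int]:
    """Return 1 (detected), 0 (not detected), or None (absence cannot be inferred)."""
    if species in checklist["species"]:
        return 1
    # Only a 'complete' checklist supports treating a missing species as absent.
    return 0 if checklist["complete"] else None


# Hypothetical checklists; keys and values are made up for illustration.
checklists = [
    {"id": "C1", "complete": True, "species": {"Northern Cardinal", "Blue Jay"}},
    {"id": "C2", "complete": True, "species": {"Blue Jay"}},
    {"id": "C3", "complete": False, "species": {"American Robin"}},
]

result = {c["id"]: detection(c, "Northern Cardinal") for c in checklists}
print(result)
```

Keeping `None` distinct from `0` avoids counting incomplete checklists as evidence of absence.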

Data collected: species ID, species count, location, distance traveled, area covered, sampling effort, no. of observers, checklist type, environmental and social covariate data

Known Issues: "Excel is unable to handle the larger data files in this dataset. The data from year 2008 contains more records than Excel supports (rows were truncated around 175,000); previous years contain fewer records. In our experiments with Excel, columns were truncated from the US48 checklist files and the extended covariates files." (Munson et al, 2013)


Best Practices:

For Observation Dataset:

- This dataset is the only one of the four listed that is not organized in a cross-tabular form.

- It is a single, extremely large (42GB) .csv file once unzipped.

- The data is temporally and spatially expansive, but is limited in that it only includes a species name, location, and count. There are no absence records in this data.
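Because the unzipped file is far too large to load into memory or a spreadsheet, one option is to stream it row by row and keep only running totals. The sketch below uses Python's standard `csv` module; the column names `species` and `count` are assumptions made for illustration and should be replaced with the headers found in the actual file. A small in-memory sample stands in for the real file.

```python
import csv
import io
from collections import Counter


def total_counts(lines, species_col="species", count_col="count"):
    """Sum counts per species while reading one row at a time."""
    totals = Counter()
    for row in csv.DictReader(lines):
        try:
            totals[row[species_col]] += int(row[count_col])
        except ValueError:
            # Skip rows whose count is not a plain integer (an assumption
            # about how to treat such rows, not documented eBird behavior).
            continue
    return totals


# Made-up sample rows; the real file would be opened with open(path, newline="").
sample = io.StringIO(
    "species,latitude,longitude,count\n"
    "Blue Jay,42.44,-76.50,3\n"
    "Blue Jay,40.71,-74.01,2\n"
    "American Robin,42.44,-76.50,5\n"
)
totals = total_counts(sample)
print(totals)
```

Since `csv.DictReader` consumes its input lazily, memory use depends on the number of distinct species rather than on the number of rows.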


For Reference Data:

- Missing / unknown variable = '?', with corresponding covariate measurement error = NA.

- Species count reported as present without a count = 'X'.

- Group submissions replicate checklists already present in the data set. Use PRIMARY_CHECKLIST_FLAG to select the set of unique checklists.
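The coding conventions above could be handled along these lines when reading the Reference Data. This is a sketch under stated assumptions: `PRIMARY_CHECKLIST_FLAG` comes from the dataset, but the flag value `"1"` for primary checklists, the `SAMPLING_EVENT_ID`, `ELEV`, and `Blue_Jay` column names, and the sample rows are all hypothetical and should be checked against the dataset documentation.

```python
import csv
import io


def parse_covariate(cell):
    """Return a float, or None when the value is coded '?' (missing / unknown)."""
    return None if cell == "?" else float(cell)


def parse_count(cell):
    """Return (present, count); 'X' means present but not counted."""
    if cell == "X":
        return True, None
    n = int(cell)
    return n > 0, n


# Made-up rows; column names other than PRIMARY_CHECKLIST_FLAG are assumptions.
sample = io.StringIO(
    "SAMPLING_EVENT_ID,PRIMARY_CHECKLIST_FLAG,ELEV,Blue_Jay\n"
    "S1,1,120.5,3\n"
    "S2,1,?,X\n"
    "S3,0,120.5,3\n"
    "S4,1,88.0,0\n"
)

# Keep only primary checklists so group submissions are not counted twice
# (assumes the flag is coded as "1" for primary checklists).
rows = [r for r in csv.DictReader(sample) if r["PRIMARY_CHECKLIST_FLAG"] == "1"]
parsed = {
    r["SAMPLING_EVENT_ID"]: (parse_covariate(r["ELEV"]), parse_count(r["Blue_Jay"]))
    for r in rows
}
print(parsed)
```

Returning the presence flag separately from the count keeps 'X' records usable in presence/absence analyses even though they cannot contribute to abundance totals.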

Conditions for use: "The eBird reference data is freely available for all usages." (Munson et al, 2013)

Citation/ Distributing Author: M. Arthur Munson, Kevin Webb, Daniel Sheldon, Daniel Fink, Wesley M. Hochachka, Marshall Iliff, Mirek Riedewald, Daria Sorokina, Brian Sullivan, Christopher Wood, and Steve Kelling. The eBird Reference Dataset, Version 5.0. Cornell Lab of Ornithology and National Audubon Society, Ithaca, NY, January 2013.

Supplemental resources: Data exploration on-line: http://ebird.org/ebird/eBirdReports?cmd=Start

Publications from eBird users: http://ebird.org/content/ebird/about/publications/

 Available via the EcoData Retriever: Yes or No

