Statistics

What do you track?

We track two types of events:

  1. Visits to a record page.
  2. Downloads of a file.

For both types of events, we track:

  1. Visitor: An anonymized visitor ID.
  2. Visitor type: Whether the request was made by a) a human, b) a machine or c) a robot.
  3. Country: The country of origin of the request (based on the IP address).
  4. Referrer: The referrer domain.

What is a view?

A user (human or machine) visiting a record, excluding double-clicks and robots.

What is a unique view?

A unique view is defined as one or more visits by a user within a 1-hour time-window. This means that if the same record was accessed multiple times by the same user within the same time-window, we consider it as one unique view.
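
The counting rule above can be sketched as follows. This is a minimal illustration, not the repository's actual implementation: it assumes fixed, clock-aligned hour buckets (the text does not say whether the window is fixed or sliding), and the names `hour_bucket` and `count_unique_views` are hypothetical.

```python
from datetime import datetime


def hour_bucket(ts: datetime) -> str:
    # Assumption: the 1-hour window is a fixed clock hour (e.g. 10:00-10:59).
    return ts.strftime("%Y-%m-%dT%H")


def count_unique_views(events):
    """Count unique views from (visitor_id, record_id, timestamp) tuples.

    Visits by the same visitor to the same record within the same
    hour bucket collapse into a single unique view.
    """
    seen = set()
    for visitor_id, record_id, ts in events:
        seen.add((visitor_id, record_id, hour_bucket(ts)))
    return len(seen)


events = [
    ("v1", "rec-1", datetime(2024, 1, 1, 10, 5)),
    ("v1", "rec-1", datetime(2024, 1, 1, 10, 40)),  # same hour: not counted again
    ("v1", "rec-1", datetime(2024, 1, 1, 11, 15)),  # next hour: new unique view
    ("v2", "rec-1", datetime(2024, 1, 1, 10, 20)),  # different visitor
]
print(count_unique_views(events))  # 3
```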

What is a download?

A user (human or machine) downloading a file from a record, excluding double-clicks and robots. If a record has multiple files and you download all files, each file counts as one download.

What is a unique download?

A unique download is defined as one or more file downloads from files of a single record by a user within a 1-hour time-window. This means that if one or more files of the same record were downloaded multiple times by the same user within the same time-window, we consider it as one unique download.
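
To illustrate how downloads and unique downloads differ, here is a small sketch. It assumes fixed clock-hour windows and a hypothetical event shape of `(visitor_id, record_id, file_key, timestamp)`; neither is confirmed by the description above.

```python
from datetime import datetime


def count_downloads(events):
    """Return (downloads, unique_downloads) for a list of file download events.

    Every file download counts as one download. Unique downloads ignore
    which file was fetched: they are keyed on visitor, record and hour.
    """
    downloads = len(events)
    unique = {
        (visitor_id, record_id, ts.strftime("%Y%m%d%H"))  # assumed fixed hour bucket
        for visitor_id, record_id, _file_key, ts in events
    }
    return downloads, len(unique)


download_events = [
    ("v1", "rec-1", "data.csv", datetime(2024, 1, 1, 10, 1)),
    ("v1", "rec-1", "paper.pdf", datetime(2024, 1, 1, 10, 2)),
    ("v1", "rec-1", "readme.txt", datetime(2024, 1, 1, 10, 3)),
    ("v2", "rec-1", "data.csv", datetime(2024, 1, 1, 10, 30)),
]
print(count_downloads(download_events))  # (4, 2)
```

Visitor `v1` fetches three files from the same record within one hour, which yields three downloads but only one unique download.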

What is downloaded data volume?

The total data volume that has been downloaded for all files in a record by a user (human or machine), excluding double-clicks and robots. In case a user cancels a file download mid-way, we still count the total file size as fully downloaded.
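
As a rough sketch of this rule, the volume can be computed from the stored file sizes rather than from the bytes actually transferred. The function name, the event shape `(file_key, bytes_sent)` and the sample sizes are all hypothetical.

```python
def downloaded_volume(events, file_sizes):
    """Sum the full file size for every download event.

    The number of bytes actually sent is deliberately ignored, so a
    cancelled download still counts with the total size of the file.
    """
    return sum(file_sizes[file_key] for file_key, _bytes_sent in events)


file_sizes = {"data.csv": 1000, "paper.pdf": 2500}  # sizes in bytes
volume_events = [
    ("data.csv", 1000),  # completed download
    ("paper.pdf", 300),  # cancelled mid-way, still counted as 2500 bytes
]
print(downloaded_volume(volume_events, file_sizes))  # 3500
```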

How do you deal with versions?

By default, a record page displays the aggregated counts of views, downloads and data volume across all versions of the record. You can expand the usage statistics box on the record page to see the counts for the specific version you are viewing.

How do you deal with robots?

Requests made by robots (aka crawlers, spiders, bots) are filtered out from the usage statistics. We detect robots based on a standardized list of robots provided by the COUNTER and Making Data Count projects.
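
Robot lists of this kind are typically applied by matching patterns against the user-agent string of each request. The sketch below shows the general idea only: the three patterns are illustrative placeholders, not entries copied from the COUNTER or Making Data Count lists, and `is_robot` and `filter_robots` are hypothetical names.

```python
import re

# Illustrative placeholder patterns; a real deployment would load the
# published robots list instead of hard-coding a few substrings.
ROBOT_PATTERNS = [r"bot", r"crawl", r"spider"]
ROBOT_RE = re.compile("|".join(ROBOT_PATTERNS), re.IGNORECASE)


def is_robot(user_agent):
    """Return True if the user-agent string matches any robot pattern."""
    return bool(ROBOT_RE.search(user_agent or ""))


def filter_robots(events):
    """Drop events whose user agent looks like a robot."""
    return [event for event in events if not is_robot(event["user_agent"])]


requests_seen = [
    {"record": "rec-1", "user_agent": "Mozilla/5.0 (X11; Linux x86_64) Firefox/120.0"},
    {"record": "rec-1", "user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1)"},
]
print(len(filter_robots(requests_seen)))  # 1
```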

How often do you update usage statistics?

Once a day.

How can I see the most viewed records?

Any search on CaltechAUTHORS allows you to sort the search results by "most viewed".

How do you track?

We comply with the COUNTER Code of Practice as well as the Code of Practice for Research Data Usage Metrics in our tracking. This means that our usage statistics can be compared with other COUNTER-compliant repositories.

What is the difference between a machine and a robot?

A machine request is an automated request initiated by a human user, e.g. a script downloading data from CaltechAUTHORS and running an analysis on the data. A robot request is an automated request made by, for example, a search engine crawler.

How do you anonymize users?

For each view/download event, we track an anonymized visitor ID. This ID changes for a user every 24 hours, so a user viewing the same record on two different days will have two different anonymized visitor IDs. We track the anonymized visitor ID so that we can count unique views and downloads.

For security purposes, we also keep a web server access log which includes your IP address and your browser’s user agent string. This web server access log is automatically deleted after a maximum of 1 year and is strictly separated from the usage statistics collection.

The anonymized visitor ID is generated from a personal identifier such as:

  1. a user ID (e.g. if you are logged in on zenodo.org),
  2. a session ID,
  3. or an IP address and your browser’s user-agent string.

We combine the personal identifier with a random text value (a salt) and apply a one-way cryptographic hash function to scramble the data. The salt is thrown away and regenerated every 24 hours. Using a random salt and then throwing it away ensures that the anonymized visitor ID is fully random.
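
The salting and hashing steps can be sketched as below. This is an illustration under stated assumptions, not the actual implementation: SHA-256, the 32-byte salt, the order in which identifiers are preferred, and the class name `VisitorAnonymizer` are all choices made for the example.

```python
import hashlib
import secrets


class VisitorAnonymizer:
    """Derive anonymized visitor IDs from a personal identifier and a salt."""

    def __init__(self):
        # Random salt held only in memory (assumed size: 32 bytes).
        self._salt = secrets.token_bytes(32)

    def rotate_salt(self):
        # Intended to run every 24 hours: the old salt is discarded,
        # so IDs from different days cannot be linked to each other.
        self._salt = secrets.token_bytes(32)

    def visitor_id(self, user_id=None, session_id=None, ip=None, user_agent=None):
        # Assumed preference order: user ID, then session ID, then IP + user agent.
        if user_id is not None:
            identifier = f"user:{user_id}"
        elif session_id is not None:
            identifier = f"session:{session_id}"
        else:
            identifier = f"ip:{ip}|ua:{user_agent}"
        # One-way hash of salt + identifier (SHA-256 is an assumption).
        return hashlib.sha256(self._salt + identifier.encode("utf-8")).hexdigest()


anonymizer = VisitorAnonymizer()
today = anonymizer.visitor_id(ip="192.0.2.10", user_agent="Mozilla/5.0")
again = anonymizer.visitor_id(ip="192.0.2.10", user_agent="Mozilla/5.0")
anonymizer.rotate_salt()
tomorrow = anonymizer.visitor_id(ip="192.0.2.10", user_agent="Mozilla/5.0")
print(today == again, today == tomorrow)  # True False
```

Within one salt period the same visitor maps to the same ID, which is what makes unique counting possible; after rotation the ID changes.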

Can I opt out of the usage statistics tracking?

No, it is not possible to opt out. The usage statistics tracking is fully anonymized and is done on the server side.

Do you support usage statistics for a community?

Not yet, but we will be adding aggregated usage statistics for your communities.

Do you track my search queries?

No.

Do you do any manual or automatic profiling of users?

No.