Reference

Text analysis glossary

Application Programming Interface (API): A protocol that defines communication between a client and server, often used to request data. APIs can help retrieve data from remote repositories, anything from weather to Twitter and Facebook.
Argument (in Python): An input that is passed into a function. For example, print('Hello World')
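As a small illustration of the "Argument" entry above, the sketch below passes arguments to a function by position and by keyword. The function greet and its parameter names are made up for this example and are not part of Constellate.

```python
def greet(name, punctuation="!"):
    """Build a greeting. `name` and `punctuation` are the function's parameters."""
    return f"Hello {name}{punctuation}"


# "World" is an argument passed by position.
message = greet("World")

# The same call with a second argument passed by keyword.
question = greet("World", punctuation="?")

print(message)
print(question)
```

The values supplied at the call site are the arguments; the names in the function definition are the parameters that receive them.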

Constellate file types

Metadata and n-gram Files
The dataset builder creates three kinds of files: (1) a CSV file containing only metadata; (2) a CSV file containing only unigrams, bigrams, or trigrams; (3) a JSON Lines file containing metadata and the textual data. The textual data includes unigrams, bigrams, trigrams, and full text (where available). The
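A JSON Lines file holds one JSON object per line, so it can be read one document at a time. The sketch below is a minimal example using only the Python standard library. The two sample records and their field names (id, title, unigramCount) are assumptions made for illustration; check a downloaded dataset for the actual fields.

```python
import io
import json
from collections import Counter

# Two made-up records standing in for a downloaded JSON Lines file.
# The field names here are illustrative assumptions, not a documented schema.
sample = io.StringIO(
    '{"id": "doc-1", "title": "First", "unigramCount": {"the": 3, "whale": 1}}\n'
    '{"id": "doc-2", "title": "Second", "unigramCount": {"the": 2, "sea": 4}}\n'
)


def read_jsonl(handle):
    """Yield one dict per non-empty line of a JSON Lines stream."""
    for line in handle:
        line = line.strip()
        if line:
            yield json.loads(line)


titles = []
totals = Counter()
for doc in read_jsonl(sample):
    titles.append(doc["title"])
    # Add this document's unigram counts to the running totals.
    totals.update(doc["unigramCount"])

print(titles)
print(totals.most_common(3))
```

To read a real file, replace the io.StringIO sample with open(path, encoding="utf-8"). Reading line by line keeps memory use low for large datasets.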

Constellate data providers

The Constellate Dataset Builder features content from each of the providers below. (Some providers also contain overlapping content.)  All content in Constellate is available to you for analysis, regardless of whether your institution subscribes to the content for access. You may build visualizations by publisher for any of the content

Constellate backend technologies

Constellate uses the following software:
Ghost -- a publishing platform that hosts our help pages
AWS -- our platform is built in the cloud and is currently running in Amazon Web Services.
BinderHub -- the web application behind our Analytics Lab
Jupyter -- our code tutorials are all written in Jupyter Notebooks
Elasticsearch

Dataset options

Note that there are two kinds of content in Constellate. Some content is 'open', and the full text of open content is included in any JSONL datasets you download. Some content is 'rights-restricted', and Constellate cannot include the full text in the JSONL datasets you download through the application directly. Read on

Developer quickstart

We have released a "quickstart" guide for experienced developers who want to learn about the Constellate platform.

Constellate search syntax

Constellate provides metadata and full-text search across the 30+ million documents we have aggregated. Use the search filters on the left-hand side of the builder screen to filter results.
Keyword search operators
The keyword query field accepts several operators:
+ or AND: signifies AND operation
| or OR: signifies OR operation
- negates

Constellate client

The Constellate tutorial notebooks include examples that use a dataset identifier to retrieve data from Constellate's backend systems and make it available in the notebook environment for analysis. This is accomplished through a client library we have developed and make available, e.g. import constellate. This client is intentionally kept

All about Constellate visualizations

Constellate has a growing number of visualizations available as you build and after you have built datasets.  We will continue to add to the visualizations and welcome suggestions! Before the quick notes on some of the current visualizations below, a hint: You may download and share these visualizations from those

User quickstart

So, someone told you to check out Constellate? Excellent.

What to expect when launching a notebook

When you open or launch a Jupyter Notebook in the Constellate lab, some fun stuff happens in the background. With the first Notebook you open, we create a notebook session just for you.  This usually happens in just a couple of minutes and can take up to 4 minutes if

Documentation Categories

The documentation for Constellate and the educational notebooks are organized according to a four-part structure inspired by Daniele Procida's presentation at PyCon Australia 2017. Where possible, we have divided the documentation into four categories: In brief, these four categories can be defined as follows: Tutorial (Learning-oriented) A carefully constructed example

Join the community

Join our email list for information about new content, lessons, features, and webinars.
