|
29 | 29 | "cell_type": "code",
|
30 | 30 | "execution_count": null,
|
31 | 31 | "id": "b0379359-694d-491c-ba18-fa80ea3c65bc",
|
32 |
| - "metadata": {}, |
| 32 | + "metadata": { |
| 33 | + "tags": [] |
| 34 | + }, |
33 | 35 | "outputs": [],
|
34 | 36 | "source": [
|
35 | 37 | "# Import required libraries\n",
|
|
56 | 58 | "id": "a639eb32-15e0-48fc-9cb5-3bd804fae0f9",
|
57 | 59 | "metadata": {},
|
58 | 60 | "source": [
|
59 |
| - "## Establishing the Connection" |
| 61 | + "## Handling API Keys" |
60 | 62 | ]
|
61 | 63 | },
|
62 | 64 | {
|
63 | 65 | "cell_type": "markdown",
|
64 | 66 | "id": "da084101-19bc-46c1-9722-317d0514da5e",
|
65 | 67 | "metadata": {},
|
66 | 68 | "source": [
|
67 |
| - "Put the API Key you just created in the `api_key` variable in the cell below:" |
| 69 | + "API keys are sensitive data! You **do not** want to accidentally check them into a publicly shared GitHub repo.\n", |
| 70 | + "\n", |
| 71 | + "The following cell will:\n", |
| 72 | + "\n", |
| 73 | + "1. first try to obtain previously saved credentials by loading with `configparser`;\n", |
| 74 | + "2. if not found, use `getpass` to request the credentials from the user (which works in notebooks as an input prompt);\n", |
| 75 | + "3. then save those user-entered credentials with `configparser` to `~/.notebook-api-keys`, which lives outside the Git-controlled directory, so it can't accidentally be added and checked in.\n", |
| 76 | + "\n", |
| 77 | + "Run the following cell and add the API Key you just created when prompted." |
68 | 78 | ]
|
69 | 79 | },
|
70 | 80 | {
|
71 | 81 | "cell_type": "code",
|
72 |
| - "execution_count": null, |
73 |
| - "id": "5fbdc8b6-a91d-4f07-a98c-2ba3e8b01726", |
74 |
| - "metadata": {}, |
| 82 | + "execution_count": 12, |
| 83 | + "id": "17a9823a-4e85-41ff-b321-4f9821359821", |
| 84 | + "metadata": { |
| 85 | + "tags": [] |
| 86 | + }, |
75 | 87 | "outputs": [],
|
76 | 88 | "source": [
|
77 |
| - "# Put your API key here\n", |
78 |
| - "api_key = \"\"" |
| 89 | + "import configparser\n", |
| 90 | + "import os\n", |
| 91 | + "from getpass import getpass\n", |
| 92 | + "\n", |
| 93 | + "def get_api_key(api_name):\n", |
| 94 | + " config_file_path = os.path.expanduser(\"~/.notebook-api-keys\")\n", |
| 95 | + " config = configparser.ConfigParser()\n", |
| 96 | + " \n", |
| 97 | + " # Try reading the existing config file\n", |
| 98 | + " if os.path.exists(config_file_path):\n", |
| 99 | + " config.read(config_file_path)\n", |
| 100 | + " \n", |
| 101 | + " # Check if API key is present\n", |
| 102 | + " if config.has_option(\"API_KEYS\", api_name):\n", |
| 103 | + " return config.get(\"API_KEYS\", api_name)\n", |
| 104 | + " \n", |
| 105 | + " # If not, prompt the user for the API key\n", |
| 106 | + " api_key = getpass(f\"Enter your {api_name} API key: \")\n", |
| 107 | + " \n", |
| 108 | + " # Save the API key in the config file\n", |
| 109 | + " if not config.has_section(\"API_KEYS\"):\n", |
| 110 | + " config.add_section(\"API_KEYS\")\n", |
| 111 | + " config.set(\"API_KEYS\", api_name, api_key)\n", |
| 112 | + " \n", |
| 113 | + " with open(config_file_path, \"w\") as f:\n", |
| 114 | + " config.write(f)\n", |
| 115 | + " \n", |
| 116 | + " return api_key\n", |
| 117 | + "\n", |
| 118 | + "# Example usage\n", |
| 119 | + "api_key = get_api_key(\"NYT\")" |
79 | 120 | ]
|
80 | 121 | },
|
81 | 122 | {
|
82 | 123 | "cell_type": "markdown",
|
83 |
| - "id": "c8f60b12-cced-49b4-9627-2782a75a8429", |
| 124 | + "id": "fae52a0a-8e33-4f78-b181-e254fa81eabe", |
84 | 125 | "metadata": {},
|
85 | 126 | "source": [
|
86 |
| - "You can save your API Key locally so that you don't always have to copy it. Just be sure not to put this file on Github, or else your API key may be compromised!" |
87 |
| - ] |
88 |
| - }, |
89 |
| - { |
90 |
| - "cell_type": "code", |
91 |
| - "execution_count": null, |
92 |
| - "id": "a62d2d57", |
93 |
| - "metadata": {}, |
94 |
| - "outputs": [], |
95 |
| - "source": [ |
96 |
| - "# Save your key locally\n", |
97 |
| - "with open(\"nyt_api_key.txt\", \"w\") as f:\n", |
98 |
| - " f.write(api_key)" |
99 |
| - ] |
100 |
| - }, |
101 |
| - { |
102 |
| - "cell_type": "code", |
103 |
| - "execution_count": null, |
104 |
| - "id": "a07ed22e-948c-4b21-990e-50c1a8e8238c", |
105 |
| - "metadata": {}, |
106 |
| - "outputs": [], |
107 |
| - "source": [ |
108 |
| - "# Read your key locally\n", |
109 |
| - "with open(\"nyt_api_key.txt\", \"r\") as f:\n", |
110 |
| - " api_key = f.read()" |
| 127 | + "💡 **Tip**: Another way to keep your credentials secure and provide convenient access is through the [JupyterLab Credential Store\n", |
| 128 | + "](https://towardsdatascience.com/the-jupyterlab-credential-store-9cc3a0b9356). If you are using JupyterLab, this is a great general solution for handling API keys!" |
111 | 129 | ]
|
112 | 130 | },
|
113 | 131 | {
|
114 | 132 | "cell_type": "markdown",
|
115 |
| - "id": "582ed0ad-ccad-4294-a50f-e1d4843217c6", |
| 133 | + "id": "5f5b395a-443f-4e40-91f7-8fcb5c33bbbe", |
116 | 134 | "metadata": {},
|
117 | 135 | "source": [
|
| 136 | + "## Using `pynytimes`\n", |
| 137 | + "\n", |
118 | 138 | "To access the NYTimes' databases, we'll be using a third-party library called [pynytimes](https://github.com/michadenheijer/pynytimes). This package provides an easy to use tool for accessing the wealth of data hosted by the Times.\n",
|
119 | 139 | "\n",
|
120 |
| - "To install the library, follow the instructions taken from their [Github repo](https://github.com/michadenheijer/pynytimes)." |
121 |
| - ] |
122 |
| - }, |
123 |
| - { |
124 |
| - "cell_type": "markdown", |
125 |
| - "id": "5f5b395a-443f-4e40-91f7-8fcb5c33bbbe", |
126 |
| - "metadata": {}, |
127 |
| - "source": [ |
128 |
| - "## Installation\n", |
| 140 | + "To install the library, follow the instructions taken from their [Github repo](https://github.com/michadenheijer/pynytimes).\n", |
129 | 141 | "\n",
|
130 | 142 | "There are multiple options to install `pynytimes`, but the easiest is by just installing it using `pip` in the Jupyter notebook itself, using a magic command:"
|
131 | 143 | ]
|
132 | 144 | },
|
133 | 145 | {
|
134 | 146 | "cell_type": "code",
|
135 |
| - "execution_count": null, |
| 147 | + "execution_count": 8, |
136 | 148 | "id": "d1d80b96-2285-43ab-bbd6-090fd4a9c2d7",
|
137 |
| - "metadata": {}, |
138 |
| - "outputs": [], |
| 149 | + "metadata": { |
| 150 | + "tags": [] |
| 151 | + }, |
| 152 | + "outputs": [ |
| 153 | + { |
| 154 | + "name": "stdout", |
| 155 | + "output_type": "stream", |
| 156 | + "text": [ |
| 157 | + "Collecting pynytimes\n", |
| 158 | + " Downloading pynytimes-0.10.0-py3-none-any.whl (20 kB)\n", |
| 159 | + "Requirement already satisfied: requests<3.0.0,>=2.10.0 in /Users/tomvannuenen/anaconda3/envs/dlab/lib/python3.10/site-packages (from pynytimes) (2.31.0)\n", |
| 160 | + "Requirement already satisfied: urllib3 in /Users/tomvannuenen/anaconda3/envs/dlab/lib/python3.10/site-packages (from pynytimes) (2.0.4)\n", |
| 161 | + "Requirement already satisfied: charset-normalizer<4,>=2 in /Users/tomvannuenen/anaconda3/envs/dlab/lib/python3.10/site-packages (from requests<3.0.0,>=2.10.0->pynytimes) (3.2.0)\n", |
| 162 | + "Requirement already satisfied: idna<4,>=2.5 in /Users/tomvannuenen/anaconda3/envs/dlab/lib/python3.10/site-packages (from requests<3.0.0,>=2.10.0->pynytimes) (3.4)\n", |
| 163 | + "Requirement already satisfied: certifi>=2017.4.17 in /Users/tomvannuenen/anaconda3/envs/dlab/lib/python3.10/site-packages (from requests<3.0.0,>=2.10.0->pynytimes) (2023.7.22)\n", |
| 164 | + "Installing collected packages: pynytimes\n", |
| 165 | + "Successfully installed pynytimes-0.10.0\n", |
| 166 | + "Note: you may need to restart the kernel to use updated packages.\n" |
| 167 | + ] |
| 168 | + } |
| 169 | + ], |
139 | 170 | "source": [
|
140 | 171 | "%pip install pynytimes"
|
141 | 172 | ]
|
|
158 | 189 | },
|
159 | 190 | {
|
160 | 191 | "cell_type": "code",
|
161 |
| - "execution_count": null, |
| 192 | + "execution_count": 9, |
162 | 193 | "id": "dce57534-fe5d-45a4-bb97-f2df8dc3d9d6",
|
163 |
| - "metadata": {}, |
| 194 | + "metadata": { |
| 195 | + "tags": [] |
| 196 | + }, |
164 | 197 | "outputs": [],
|
165 | 198 | "source": [
|
166 | 199 | "# Import the NYTAPI object which we'll use to access the API\n",
|
|
169 | 202 | },
|
170 | 203 | {
|
171 | 204 | "cell_type": "code",
|
172 |
| - "execution_count": null, |
| 205 | + "execution_count": 13, |
173 | 206 | "id": "018a6dfa-9343-42b1-9014-165ae7507b0c",
|
174 |
| - "metadata": {}, |
| 207 | + "metadata": { |
| 208 | + "tags": [] |
| 209 | + }, |
175 | 210 | "outputs": [],
|
176 | 211 | "source": [
|
177 | 212 | "# Initialize the NYT API class into an object using your API key\n",
|
|
339 | 374 | },
|
340 | 375 | {
|
341 | 376 | "cell_type": "code",
|
342 |
| - "execution_count": 1, |
| 377 | + "execution_count": null, |
343 | 378 | "id": "12c920bc-bf8c-48a4-a34e-d10e4a23476d",
|
344 | 379 | "metadata": {},
|
345 | 380 | "outputs": [],
|
|
780 | 815 | },
|
781 | 816 | {
|
782 | 817 | "cell_type": "code",
|
783 |
| - "execution_count": 3, |
| 818 | + "execution_count": null, |
784 | 819 | "id": "5dce99e3-4cf1-475e-a7a2-7f545f06d6e3",
|
785 | 820 | "metadata": {},
|
786 | 821 | "outputs": [],
|
|
867 | 902 | "Let's load in the previously saved data:"
|
868 | 903 | ]
|
869 | 904 | },
|
| 905 | + { |
| 906 | + "cell_type": "code", |
| 907 | + "execution_count": null, |
| 908 | + "id": "29685d83-c90b-407e-be03-e76c67a6215c", |
| 909 | + "metadata": { |
| 910 | + "tags": [] |
| 911 | + }, |
| 912 | + "outputs": [], |
| 913 | + "source": [ |
| 914 | + "pd.__version__" |
| 915 | + ] |
| 916 | + }, |
870 | 917 | {
|
871 | 918 | "cell_type": "code",
|
872 | 919 | "execution_count": null,
|
873 | 920 | "id": "cb310d22-895a-4dff-8442-247aa8b02b42",
|
874 |
| - "metadata": {}, |
| 921 | + "metadata": { |
| 922 | + "tags": [] |
| 923 | + }, |
875 | 924 | "outputs": [],
|
876 | 925 | "source": [
|
877 | 926 | "df = pd.read_pickle(\"../data/election2020_articles.pkl\")\n",
|
|
1033 | 1082 | },
|
1034 | 1083 | {
|
1035 | 1084 | "cell_type": "code",
|
1036 |
| - "execution_count": 2, |
| 1085 | + "execution_count": null, |
1037 | 1086 | "id": "e7a2ebef-41a7-4cc1-a553-a4fc33a3cc57",
|
1038 | 1087 | "metadata": {},
|
1039 | 1088 | "outputs": [],
|
|
1284 | 1333 | ],
|
1285 | 1334 | "metadata": {
|
1286 | 1335 | "kernelspec": {
|
1287 |
| - "display_name": "Python 3 (ipykernel)", |
| 1336 | + "display_name": "dlab", |
1288 | 1337 | "language": "python",
|
1289 |
| - "name": "python3" |
| 1338 | + "name": "dlab" |
1290 | 1339 | },
|
1291 | 1340 | "language_info": {
|
1292 | 1341 | "codemirror_mode": {
|
|
1298 | 1347 | "name": "python",
|
1299 | 1348 | "nbconvert_exporter": "python",
|
1300 | 1349 | "pygments_lexer": "ipython3",
|
1301 |
| - "version": "3.8.13" |
| 1350 | + "version": "3.10.12" |
1302 | 1351 | }
|
1303 | 1352 | },
|
1304 | 1353 | "nbformat": 4,
|
|
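A note on the `get_api_key` cell added in this diff: it writes the key to `~/.notebook-api-keys` with plain `open(..., "w")`, so the file gets whatever permissions the user's umask allows (often readable by other users on the machine). The default `ConfigParser` also treats `%` as interpolation syntax, so a key containing `%` would raise on `config.set`. Below is a minimal sketch of one way to tighten both points. The helper names `save_api_key` and `load_api_key` and the explicit `config_file_path` parameter are hypothetical and not part of the notebook; the `API_KEYS` section name is taken from the diff.

```python
import configparser
import os


def save_api_key(api_name, api_key, config_file_path):
    """Store api_key under [API_KEYS] and restrict the file to its owner.

    Hypothetical helper: the notebook's get_api_key does this inline.
    """
    # interpolation=None lets values containing "%" be stored verbatim
    config = configparser.ConfigParser(interpolation=None)
    config.read(config_file_path)  # a missing file is silently skipped
    if not config.has_section("API_KEYS"):
        config.add_section("API_KEYS")
    config.set("API_KEYS", api_name, api_key)

    # Create the file with owner-only permissions instead of umask defaults
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    fd = os.open(config_file_path, flags, 0o600)
    with os.fdopen(fd, "w") as f:
        config.write(f)
    # The mode passed to os.open only applies to new files, so also
    # tighten a file that already existed with looser permissions
    os.chmod(config_file_path, 0o600)


def load_api_key(api_name, config_file_path):
    """Return the stored key, or None if the file or entry is missing."""
    config = configparser.ConfigParser(interpolation=None)
    config.read(config_file_path)
    return config.get("API_KEYS", api_name, fallback=None)
```

With these two pieces, the notebook cell could call `load_api_key("NYT", path)` first and fall back to `getpass` plus `save_api_key` only when it returns `None`. On Windows, `os.chmod` only toggles the read-only flag, so the permission tightening there is an assumption that would need a different mechanism.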