Commit 1600c25

Merge pull request #1 from GoogleCloudPlatform/master
Merging head repo to base
2 parents ca34e87 + 815d161 commit 1600c25

File tree

12 files changed (+679, -49 lines)

courses/fast-and-lean-data-science/colab_intro.ipynb

Lines changed: 199 additions & 0 deletions
@@ -0,0 +1,199 @@
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "Welcome To Colaboratory",
      "version": "0.3.2",
      "provenance": [],
      "collapsed_sections": [],
      "toc_visible": true,
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/GoogleCloudPlatform/training-data-analyst/blob/master/courses/fast-and-lean-data-science/colab_intro.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "5fCEDCU_qrC0"
      },
      "source": [
        "<img alt=\"Colaboratory logo\" height=\"45px\" src=\"https://colab.research.google.com/img/colab_favicon.ico\" align=\"left\" hspace=\"10px\" vspace=\"0px\">\n",
        "\n",
        "<h1>Welcome to Colaboratory!</h1>\n",
        "\n",
        "Colaboratory is a free Jupyter notebook environment that requires no setup and runs entirely in the cloud.\n",
        "\n",
        "With Colaboratory you can write and execute code, save and share your analyses, and access powerful computing resources, all for free from your browser."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "GJBs_flRovLc"
      },
      "source": [
        "## Running code\n",
        "\n",
        "Code cells can be executed in sequence by pressing Shift-ENTER. Try it now."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "k5QBNS8zPteE",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "import math\n",
        "import tensorflow as tf\n",
        "from matplotlib import pyplot as plt\n",
        "print(\"Tensorflow version \" + tf.__version__)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "gJr_9dXGpJ05",
        "colab": {}
      },
      "source": [
        "a=1\n",
        "b=2"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "-gE-Ez1qtyIA",
        "colab": {}
      },
      "source": [
        "a+b"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "UgR1-U1eQKhJ",
        "colab_type": "text"
      },
      "source": [
        "## Hidden cells\n",
        "Some cells contain code that is necessary but not interesting for the exercise at hand. These cells will typically be collapsed to let you focus on the more interesting pieces of code. If you want to see their contents, double-click the cell. Whether you peek inside or not, **you must run the hidden cells for the code inside to be interpreted**. Try it now; the cell is marked **RUN ME**."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "pG-s3-a_Ppqv",
        "colab_type": "code",
        "cellView": "form",
        "colab": {}
      },
      "source": [
        "#@title \"Hidden cell with boring code [RUN ME]\"\n",
        "\n",
        "def display_sinusoid():\n",
        "  X = range(180)\n",
        "  Y = [math.sin(x/10.0) for x in X]\n",
        "  plt.plot(X, Y)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "ILZmeORzRaJB",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "display_sinusoid()"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "lSrWNr3MuFUS"
      },
      "source": [
        "Did it work? If not, run the collapsed cell marked **RUN ME** and try again!\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "-Rh3-Vt9Nev9"
      },
      "source": [
        "## Accelerators\n",
        "\n",
        "Colaboratory offers free GPU and TPU (Tensor Processing Unit) accelerators.\n",
        "\n",
        "Use the menu *Runtime > Change runtime type*, select a TPU accelerator and then **run the whole notebook again**.\n",
        "\n",
        "The cell below should list the eight cores of a TPU.\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "TDwroXseS27Z",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "try:\n",
        "  tpu = tf.contrib.cluster_resolver.TPUClusterResolver() # TPU detection\n",
        "  strategy = tf.contrib.tpu.TPUDistributionStrategy(tpu)\n",
        "except ValueError:\n",
        "  print(\"Running on GPU or CPU\")"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "e4To6dhjTptA",
        "colab_type": "text"
      },
      "source": [
        "Did you see the eight TPU cores?\n",
        "\n",
        "If not, check under *Runtime > Change runtime type* and try running the entire notebook again."
      ]
    }
  ]
}
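Note: the final code cell above creates the TPU resolver and distribution strategy but never prints the cores, even though the surrounding text promises a list of eight. Here is a minimal sketch of how one might actually enumerate them, assuming the same TF 1.x `tf.contrib` API the notebook targets; the session-based device listing is an illustrative addition, not part of this commit:

```python
import tensorflow as tf

try:
    tpu = tf.contrib.cluster_resolver.TPUClusterResolver()  # TPU detection
    strategy = tf.contrib.tpu.TPUDistributionStrategy(tpu)
    # Assumption: opening a session against the TPU worker and listing its
    # devices should surface eight entries with device_type "TPU".
    with tf.Session(tpu.master()) as sess:
        cores = [d.name for d in sess.list_devices() if d.device_type == "TPU"]
    print("\n".join(cores))
except ValueError:
    print("Running on GPU or CPU")
```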

courses/machine_learning/deepdive/01_bigquery/b_bqml.ipynb

Lines changed: 1 addition & 1 deletion
@@ -377,7 +377,7 @@
 "\n",
 "There's lots of work going on behind the scenes to make this look easy. For example BQML is automatically creating a training/evaluation split, tuning our learning rate, and one-hot encoding features if necessary. When we move to TensorFlow these are all things we'll need to do ourselves. \n",
 "\n",
-"This notebook was just to inspire usagage of BQML, the current model is actually very poor. We'll prove this in the next lesson by beating it with a simple heuristic. \n",
+"This notebook was just to inspire usage of BQML; the current model is actually very poor. We'll prove this in the next lesson by beating it with a simple heuristic. \n",
 "\n",
 "We could improve our model considerably with some feature engineering but we'll save that for a future lesson. Also there are additional BQML functions such as `ML.WEIGHTS` and `ML.EVALUATE` that we haven't even explored. If you're interested in learning more about BQML I encourage you to [read the official docs](https://cloud.google.com/bigquery/docs/bigqueryml).\n",
 "\n",

courses/machine_learning/deepdive/02_tensorflow/g_distributed.ipynb

Lines changed: 4 additions & 3 deletions
@@ -88,7 +88,7 @@
 "\n",
 "Note the new `train_data_path` above. It is ~20,000,000 rows (100x the original dataset) and 1.25GB sharded across 10 files. How did we create this file?\n",
 "\n",
-"Go to http://bigquery.cloud.google.com/ and paste the query:\n",
+"Go to https://console.cloud.google.com/bigquery and paste the query:\n",
 "<pre>\n",
 "  #standardSQL\n",
 "  SELECT\n",
@@ -118,9 +118,10 @@
 "\n",
 "Export this to CSV using the following steps (Note that <b>we have already done this and made the resulting GCS data publicly available</b>, so following these steps is optional):\n",
 "<ol>\n",
-"<li> Click on the \"Save As Table\" button and note down the name of the dataset and table.\n",
+"<li> Click on the \"Save Results\" button and select \"BigQuery Table\" (we can't directly export to CSV because the file is too large). \n",
+"<li> Specify a dataset name and table name (if you don't have an existing dataset, <a href=\"https://cloud.google.com/bigquery/docs/datasets#create-dataset\">create one</a>). \n",
 "<li> On the BigQuery console, find the newly exported table in the left-hand-side menu, and click on the name.\n",
-"<li> Click on \"Export Table\"\n",
+"<li> Click on the \"Export\" button, then select \"Export to GCS\".\n",
 "<li> Supply your bucket and file name (for example: gs://cloud-training-demos/taxifare/large/taxi-train*.csv). The asterisk allows for sharding of large files.\n",
 "</ol>\n",
 "\n",

courses/machine_learning/deepdive/02_tensorflow/labs/d_csv_input_fn.ipynb

Lines changed: 1 addition & 1 deletion
@@ -404,7 +404,7 @@
 "source": [
 "## Challenge exercise\n",
 "\n",
-"Create a neural network that is capable of finding the volume of a cylinder given the radius of its base (r) and its height (h). Assume that the radius and height of the cylinder are both in the range 0.5 to 2.0. Unlike in the challenge exercise for b_estimator.ipynb, assume that your measurements of r, h and V are all rounded off to the nearest 0.1. Simulate the necessary training dataset. This time, you will need a lot more data to get a good predictor.\n",
+"Create a neural network that is capable of finding the volume of a cylinder given the radius of its base (r) and its height (h). Assume that the radius and height of the cylinder are both in the range 0.5 to 2.0. Unlike in the challenge exercise for c_estimator.ipynb, assume that your measurements of r, h and V are all rounded off to the nearest 0.1. Simulate the necessary training dataset. This time, you will need a lot more data to get a good predictor.\n",
 "\n",
 "Hint (highlight to see):\n",
 "<p style='color:white'>\n",

courses/machine_learning/deepdive/02_tensorflow/labs/e_traineval.ipynb

Lines changed: 1 addition & 1 deletion
@@ -467,7 +467,7 @@
 "source": [
 "## Challenge exercise\n",
 "\n",
-"Modify your solution to the challenge exercise in c_dataset.ipynb appropriately."
+"Modify your solution to the challenge exercise in d_csv_input.ipynb appropriately."
 ]
 },
 {

courses/machine_learning/deepdive/02_tensorflow/labs/f_ai_platform.ipynb

Lines changed: 1 addition & 1 deletion
@@ -499,7 +499,7 @@
 "source": [
 "## Challenge exercise\n",
 "\n",
-"Modify your solution to the challenge exercise in d_trainandevaluate.ipynb appropriately. Make sure that you implement training and deployment. Increase the size of your dataset by 10x since you are running on the cloud. Does your accuracy improve?"
+"Modify your solution to the challenge exercise in e_traineval.ipynb appropriately. Make sure that you implement training and deployment. Increase the size of your dataset by 10x since you are running on the cloud. Does your accuracy improve?"
 ]
 },
 {

courses/machine_learning/deepdive/02_tensorflow/labs/g_distributed.ipynb

Lines changed: 8 additions & 4 deletions
@@ -63,7 +63,10 @@
 "source": [
 "#### **Exercise 1**\n",
 "\n",
-"In the cell below, we will submit another (much larger) training job to the cloud. However, this time we'll alter some of the previous parameters. Fill in the missing code in the TODOs below. You can reference the previous `f_ai_platform` notebook if you get stuck. Note that, now we will want to include an additional parameter for `scale-tier` to specify the distributed training environment. You can follow these links to read more about [\"Using Distributed TensorFlow with Cloud ML Engine\"](https://cloud.google.com/ml-engine/docs/tensorflow/distributed-tensorflow-mnist-cloud-datalab) or [\"Specifying Machine Types or Scale Tiers\"](https://cloud.google.com/ml-engine/docs/tensorflow/machine-types#scale_tiers)."
+"In the cell below, we will submit another (much larger) training job to the cloud. However, this time we'll alter some of the previous parameters. Fill in the missing code in the TODOs below. You can reference the previous `f_ai_platform` notebook if you get stuck. Note that now we will want to include an additional parameter for `scale-tier` to specify the distributed training environment. You can follow these links to read more about [\"Using Distributed TensorFlow with Cloud ML Engine\"](https://cloud.google.com/ml-engine/docs/tensorflow/distributed-tensorflow-mnist-cloud-datalab) or [\"Specifying Machine Types or Scale Tiers\"](https://cloud.google.com/ml-engine/docs/tensorflow/machine-types#scale_tiers).\n",
+"\n",
+"#### **Exercise 2**\n",
+"Notice how our `train_data_path` contains a wildcard character. This means we're going to be reading in a list of sharded files; modify your `read_dataset()` function in `model.py` to handle this (or verify it already does)."
 ]
 },
 {
@@ -97,7 +100,7 @@
 "\n",
 "Note the new `train_data_path` above. It is ~20,000,000 rows (100x the original dataset) and 1.25GB sharded across 10 files. How did we create this file?\n",
 "\n",
-"Go to http://bigquery.cloud.google.com/ and paste the query:\n",
+"Go to https://console.cloud.google.com/bigquery and paste the query:\n",
 "<pre>\n",
 "  #standardSQL\n",
 "  SELECT\n",
@@ -127,9 +130,10 @@
 "\n",
 "Export this to CSV using the following steps (Note that <b>we have already done this and made the resulting GCS data publicly available</b>, so following these steps is optional):\n",
 "<ol>\n",
-"<li> Click on the \"Save As Table\" button and note down the name of the dataset and table.\n",
+"<li> Click on the \"Save Results\" button and select \"BigQuery Table\" (we can't directly export to CSV because the file is too large). \n",
+"<li> Specify a dataset name and table name (if you don't have an existing dataset, <a href=\"https://cloud.google.com/bigquery/docs/datasets#create-dataset\">create one</a>). \n",
 "<li> On the BigQuery console, find the newly exported table in the left-hand-side menu, and click on the name.\n",
-"<li> Click on \"Export Table\"\n",
+"<li> Click on the \"Export\" button, then select \"Export to GCS\".\n",
 "<li> Supply your bucket and file name (for example: gs://cloud-training-demos/taxifare/large/taxi-train*.csv). The asterisk allows for sharding of large files.\n",
 "</ol>\n",
 "\n",
