This simple example demonstrates how to plug TensorFlow Datasets (TFDS) into a Keras model and submit a training job in HPC environments. Although NJIT is quoted the approach must be the same for NYU.
Copyright 2020 The TensorFlow Datasets Authors, Licensed under the Apache License, Version 2.0
import tensorflow as tf
import tensorflow_datasets as tfds
2023-10-03 09:29:30.258272: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-10-03 09:29:30.258321: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-10-03 09:29:30.258358: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
Set 0: VPN and Login
You need to install a VPN client and establish a VPN connection to NJIT data center. Consult https://ist.njit.edu/vpn for help in doing so.
# select from the two options below and ssh into the HPC server
ssh ucid@HPC_HOST.njit.edu
ssh [email protected]
Step 2: Create and train the model
Plug the TFDS input pipeline into a simple Keras model, compile the model, and train it.
model = tf.keras.models.Sequential([
tf.keras.layers.Flatten(input_shape= (28 , 28 )),
tf.keras.layers.Dense(128 , activation= 'relu' ),
tf.keras.layers.Dense(10 )
])
model.compile (
optimizer= tf.keras.optimizers.Adam(0.001 ),
loss= tf.keras.losses.SparseCategoricalCrossentropy(from_logits= True ),
metrics= [tf.keras.metrics.SparseCategoricalAccuracy()],
)
model.fit(
ds_train,
epochs= 6 ,
validation_data= ds_test,
)
Epoch 1/6
1/469 [..............................] - ETA: 16:54 - loss: 2.5053 - sparse_categorical_accuracy: 0.0703
20/469 [>.............................] - ETA: 1s - loss: 1.6709 - sparse_categorical_accuracy: 0.5254
41/469 [=>............................] - ETA: 1s - loss: 1.1971 - sparse_categorical_accuracy: 0.6812
62/469 [==>...........................] - ETA: 1s - loss: 0.9548 - sparse_categorical_accuracy: 0.7452
84/469 [====>.........................] - ETA: 0s - loss: 0.8144 - sparse_categorical_accuracy: 0.7816
106/469 [=====>........................] - ETA: 0s - loss: 0.7238 - sparse_categorical_accuracy: 0.8053
128/469 [=======>......................] - ETA: 0s - loss: 0.6602 - sparse_categorical_accuracy: 0.8217
150/469 [========>.....................] - ETA: 0s - loss: 0.6103 - sparse_categorical_accuracy: 0.8352
173/469 [==========>...................] - ETA: 0s - loss: 0.5696 - sparse_categorical_accuracy: 0.8461
196/469 [===========>..................] - ETA: 0s - loss: 0.5361 - sparse_categorical_accuracy: 0.8550
218/469 [============>.................] - ETA: 0s - loss: 0.5094 - sparse_categorical_accuracy: 0.8617
241/469 [==============>...............] - ETA: 0s - loss: 0.4872 - sparse_categorical_accuracy: 0.8675
264/469 [===============>..............] - ETA: 0s - loss: 0.4659 - sparse_categorical_accuracy: 0.8735
287/469 [=================>............] - ETA: 0s - loss: 0.4483 - sparse_categorical_accuracy: 0.8782
310/469 [==================>...........] - ETA: 0s - loss: 0.4336 - sparse_categorical_accuracy: 0.8823
332/469 [====================>.........] - ETA: 0s - loss: 0.4211 - sparse_categorical_accuracy: 0.8856
354/469 [=====================>........] - ETA: 0s - loss: 0.4087 - sparse_categorical_accuracy: 0.8890
376/469 [=======================>......] - ETA: 0s - loss: 0.3992 - sparse_categorical_accuracy: 0.8914
398/469 [========================>.....] - ETA: 0s - loss: 0.3887 - sparse_categorical_accuracy: 0.8942
421/469 [=========================>....] - ETA: 0s - loss: 0.3791 - sparse_categorical_accuracy: 0.8966
444/469 [===========================>..] - ETA: 0s - loss: 0.3716 - sparse_categorical_accuracy: 0.8985
467/469 [============================>.] - ETA: 0s - loss: 0.3625 - sparse_categorical_accuracy: 0.9009
469/469 [==============================] - 4s 4ms/step - loss: 0.3621 - sparse_categorical_accuracy: 0.9011 - val_loss: 0.1925 - val_sparse_categorical_accuracy: 0.9463
Epoch 2/6
1/469 [..............................] - ETA: 35s - loss: 0.1062 - sparse_categorical_accuracy: 0.9766
23/469 [>.............................] - ETA: 1s - loss: 0.1855 - sparse_categorical_accuracy: 0.9467
45/469 [=>............................] - ETA: 0s - loss: 0.1784 - sparse_categorical_accuracy: 0.9497
67/469 [===>..........................] - ETA: 0s - loss: 0.1805 - sparse_categorical_accuracy: 0.9488
89/469 [====>.........................] - ETA: 0s - loss: 0.1765 - sparse_categorical_accuracy: 0.9506
111/469 [======>.......................] - ETA: 0s - loss: 0.1760 - sparse_categorical_accuracy: 0.9505
133/469 [=======>......................] - ETA: 0s - loss: 0.1784 - sparse_categorical_accuracy: 0.9496
155/469 [========>.....................] - ETA: 0s - loss: 0.1765 - sparse_categorical_accuracy: 0.9494
178/469 [==========>...................] - ETA: 0s - loss: 0.1737 - sparse_categorical_accuracy: 0.9505
200/469 [===========>..................] - ETA: 0s - loss: 0.1710 - sparse_categorical_accuracy: 0.9512
222/469 [=============>................] - ETA: 0s - loss: 0.1716 - sparse_categorical_accuracy: 0.9512
244/469 [==============>...............] - ETA: 0s - loss: 0.1695 - sparse_categorical_accuracy: 0.9521
266/469 [================>.............] - ETA: 0s - loss: 0.1676 - sparse_categorical_accuracy: 0.9525
288/469 [=================>............] - ETA: 0s - loss: 0.1670 - sparse_categorical_accuracy: 0.9526
310/469 [==================>...........] - ETA: 0s - loss: 0.1657 - sparse_categorical_accuracy: 0.9528
333/469 [====================>.........] - ETA: 0s - loss: 0.1653 - sparse_categorical_accuracy: 0.9528
355/469 [=====================>........] - ETA: 0s - loss: 0.1644 - sparse_categorical_accuracy: 0.9529
377/469 [=======================>......] - ETA: 0s - loss: 0.1643 - sparse_categorical_accuracy: 0.9530
400/469 [========================>.....] - ETA: 0s - loss: 0.1633 - sparse_categorical_accuracy: 0.9532
423/469 [==========================>...] - ETA: 0s - loss: 0.1616 - sparse_categorical_accuracy: 0.9538
446/469 [===========================>..] - ETA: 0s - loss: 0.1606 - sparse_categorical_accuracy: 0.9542
469/469 [==============================] - ETA: 0s - loss: 0.1602 - sparse_categorical_accuracy: 0.9543
469/469 [==============================] - 1s 3ms/step - loss: 0.1602 - sparse_categorical_accuracy: 0.9543 - val_loss: 0.1392 - val_sparse_categorical_accuracy: 0.9588
Epoch 3/6
1/469 [..............................] - ETA: 32s - loss: 0.1546 - sparse_categorical_accuracy: 0.9609
23/469 [>.............................] - ETA: 1s - loss: 0.1243 - sparse_categorical_accuracy: 0.9640
45/469 [=>............................] - ETA: 0s - loss: 0.1216 - sparse_categorical_accuracy: 0.9648
67/469 [===>..........................] - ETA: 0s - loss: 0.1272 - sparse_categorical_accuracy: 0.9628
90/469 [====>.........................] - ETA: 0s - loss: 0.1230 - sparse_categorical_accuracy: 0.9653
113/469 [======>.......................] - ETA: 0s - loss: 0.1236 - sparse_categorical_accuracy: 0.9649
136/469 [=======>......................] - ETA: 0s - loss: 0.1238 - sparse_categorical_accuracy: 0.9649
159/469 [=========>....................] - ETA: 0s - loss: 0.1227 - sparse_categorical_accuracy: 0.9651
182/469 [==========>...................] - ETA: 0s - loss: 0.1205 - sparse_categorical_accuracy: 0.9658
205/469 [============>.................] - ETA: 0s - loss: 0.1186 - sparse_categorical_accuracy: 0.9664
227/469 [=============>................] - ETA: 0s - loss: 0.1192 - sparse_categorical_accuracy: 0.9661
249/469 [==============>...............] - ETA: 0s - loss: 0.1189 - sparse_categorical_accuracy: 0.9661
271/469 [================>.............] - ETA: 0s - loss: 0.1182 - sparse_categorical_accuracy: 0.9662
293/469 [=================>............] - ETA: 0s - loss: 0.1178 - sparse_categorical_accuracy: 0.9666
316/469 [===================>..........] - ETA: 0s - loss: 0.1185 - sparse_categorical_accuracy: 0.9662
338/469 [====================>.........] - ETA: 0s - loss: 0.1186 - sparse_categorical_accuracy: 0.9662
361/469 [======================>.......] - ETA: 0s - loss: 0.1185 - sparse_categorical_accuracy: 0.9660
384/469 [=======================>......] - ETA: 0s - loss: 0.1177 - sparse_categorical_accuracy: 0.9661
407/469 [=========================>....] - ETA: 0s - loss: 0.1178 - sparse_categorical_accuracy: 0.9663
430/469 [==========================>...] - ETA: 0s - loss: 0.1177 - sparse_categorical_accuracy: 0.9662
453/469 [===========================>..] - ETA: 0s - loss: 0.1173 - sparse_categorical_accuracy: 0.9664
469/469 [==============================] - 1s 2ms/step - loss: 0.1174 - sparse_categorical_accuracy: 0.9664 - val_loss: 0.1084 - val_sparse_categorical_accuracy: 0.9693
Epoch 4/6
1/469 [..............................] - ETA: 32s - loss: 0.0986 - sparse_categorical_accuracy: 0.9688
24/469 [>.............................] - ETA: 0s - loss: 0.1070 - sparse_categorical_accuracy: 0.9684
47/469 [==>...........................] - ETA: 0s - loss: 0.1030 - sparse_categorical_accuracy: 0.9714
70/469 [===>..........................] - ETA: 0s - loss: 0.0953 - sparse_categorical_accuracy: 0.9733
93/469 [====>.........................] - ETA: 0s - loss: 0.0948 - sparse_categorical_accuracy: 0.9733
116/469 [======>.......................] - ETA: 0s - loss: 0.0920 - sparse_categorical_accuracy: 0.9744
139/469 [=======>......................] - ETA: 0s - loss: 0.0917 - sparse_categorical_accuracy: 0.9747
161/469 [=========>....................] - ETA: 0s - loss: 0.0920 - sparse_categorical_accuracy: 0.9746
184/469 [==========>...................] - ETA: 0s - loss: 0.0915 - sparse_categorical_accuracy: 0.9748
206/469 [============>.................] - ETA: 0s - loss: 0.0912 - sparse_categorical_accuracy: 0.9745
228/469 [=============>................] - ETA: 0s - loss: 0.0913 - sparse_categorical_accuracy: 0.9746
251/469 [===============>..............] - ETA: 0s - loss: 0.0923 - sparse_categorical_accuracy: 0.9744
273/469 [================>.............] - ETA: 0s - loss: 0.0920 - sparse_categorical_accuracy: 0.9746
295/469 [=================>............] - ETA: 0s - loss: 0.0923 - sparse_categorical_accuracy: 0.9742
317/469 [===================>..........] - ETA: 0s - loss: 0.0929 - sparse_categorical_accuracy: 0.9740
339/469 [====================>.........] - ETA: 0s - loss: 0.0931 - sparse_categorical_accuracy: 0.9738
361/469 [======================>.......] - ETA: 0s - loss: 0.0924 - sparse_categorical_accuracy: 0.9740
383/469 [=======================>......] - ETA: 0s - loss: 0.0923 - sparse_categorical_accuracy: 0.9741
405/469 [========================>.....] - ETA: 0s - loss: 0.0921 - sparse_categorical_accuracy: 0.9741
428/469 [==========================>...] - ETA: 0s - loss: 0.0917 - sparse_categorical_accuracy: 0.9742
451/469 [===========================>..] - ETA: 0s - loss: 0.0913 - sparse_categorical_accuracy: 0.9743
469/469 [==============================] - 1s 3ms/step - loss: 0.0911 - sparse_categorical_accuracy: 0.9743 - val_loss: 0.0968 - val_sparse_categorical_accuracy: 0.9714
Epoch 5/6
1/469 [..............................] - ETA: 32s - loss: 0.0625 - sparse_categorical_accuracy: 0.9922
23/469 [>.............................] - ETA: 1s - loss: 0.0698 - sparse_categorical_accuracy: 0.9820
45/469 [=>............................] - ETA: 0s - loss: 0.0682 - sparse_categorical_accuracy: 0.9811
68/469 [===>..........................] - ETA: 0s - loss: 0.0707 - sparse_categorical_accuracy: 0.9805
90/469 [====>.........................] - ETA: 0s - loss: 0.0729 - sparse_categorical_accuracy: 0.9793
112/469 [======>.......................] - ETA: 0s - loss: 0.0734 - sparse_categorical_accuracy: 0.9794
135/469 [=======>......................] - ETA: 0s - loss: 0.0752 - sparse_categorical_accuracy: 0.9791
158/469 [=========>....................] - ETA: 0s - loss: 0.0742 - sparse_categorical_accuracy: 0.9794
181/469 [==========>...................] - ETA: 0s - loss: 0.0738 - sparse_categorical_accuracy: 0.9795
204/469 [============>.................] - ETA: 0s - loss: 0.0728 - sparse_categorical_accuracy: 0.9797
227/469 [=============>................] - ETA: 0s - loss: 0.0727 - sparse_categorical_accuracy: 0.9797
249/469 [==============>...............] - ETA: 0s - loss: 0.0715 - sparse_categorical_accuracy: 0.9802
271/469 [================>.............] - ETA: 0s - loss: 0.0704 - sparse_categorical_accuracy: 0.9806
293/469 [=================>............] - ETA: 0s - loss: 0.0705 - sparse_categorical_accuracy: 0.9805
315/469 [===================>..........] - ETA: 0s - loss: 0.0709 - sparse_categorical_accuracy: 0.9804
337/469 [====================>.........] - ETA: 0s - loss: 0.0708 - sparse_categorical_accuracy: 0.9802
360/469 [======================>.......] - ETA: 0s - loss: 0.0720 - sparse_categorical_accuracy: 0.9797
383/469 [=======================>......] - ETA: 0s - loss: 0.0729 - sparse_categorical_accuracy: 0.9791
406/469 [========================>.....] - ETA: 0s - loss: 0.0727 - sparse_categorical_accuracy: 0.9792
429/469 [==========================>...] - ETA: 0s - loss: 0.0731 - sparse_categorical_accuracy: 0.9791
452/469 [===========================>..] - ETA: 0s - loss: 0.0735 - sparse_categorical_accuracy: 0.9791
469/469 [==============================] - 1s 2ms/step - loss: 0.0738 - sparse_categorical_accuracy: 0.9790 - val_loss: 0.0881 - val_sparse_categorical_accuracy: 0.9735
Epoch 6/6
1/469 [..............................] - ETA: 32s - loss: 0.0219 - sparse_categorical_accuracy: 1.0000
24/469 [>.............................] - ETA: 1s - loss: 0.0620 - sparse_categorical_accuracy: 0.9821
47/469 [==>...........................] - ETA: 0s - loss: 0.0629 - sparse_categorical_accuracy: 0.9830
70/469 [===>..........................] - ETA: 0s - loss: 0.0640 - sparse_categorical_accuracy: 0.9830
93/469 [====>.........................] - ETA: 0s - loss: 0.0619 - sparse_categorical_accuracy: 0.9830
116/469 [======>.......................] - ETA: 0s - loss: 0.0614 - sparse_categorical_accuracy: 0.9829
139/469 [=======>......................] - ETA: 0s - loss: 0.0620 - sparse_categorical_accuracy: 0.9823
162/469 [=========>....................] - ETA: 0s - loss: 0.0632 - sparse_categorical_accuracy: 0.9820
185/469 [==========>...................] - ETA: 0s - loss: 0.0630 - sparse_categorical_accuracy: 0.9821
208/469 [============>.................] - ETA: 0s - loss: 0.0624 - sparse_categorical_accuracy: 0.9823
231/469 [=============>................] - ETA: 0s - loss: 0.0634 - sparse_categorical_accuracy: 0.9821
254/469 [===============>..............] - ETA: 0s - loss: 0.0630 - sparse_categorical_accuracy: 0.9822
277/469 [================>.............] - ETA: 0s - loss: 0.0634 - sparse_categorical_accuracy: 0.9819
300/469 [==================>...........] - ETA: 0s - loss: 0.0633 - sparse_categorical_accuracy: 0.9821
322/469 [===================>..........] - ETA: 0s - loss: 0.0633 - sparse_categorical_accuracy: 0.9821
345/469 [=====================>........] - ETA: 0s - loss: 0.0627 - sparse_categorical_accuracy: 0.9821
368/469 [======================>.......] - ETA: 0s - loss: 0.0628 - sparse_categorical_accuracy: 0.9820
390/469 [=======================>......] - ETA: 0s - loss: 0.0625 - sparse_categorical_accuracy: 0.9822
413/469 [=========================>....] - ETA: 0s - loss: 0.0622 - sparse_categorical_accuracy: 0.9823
435/469 [==========================>...] - ETA: 0s - loss: 0.0618 - sparse_categorical_accuracy: 0.9823
457/469 [============================>.] - ETA: 0s - loss: 0.0618 - sparse_categorical_accuracy: 0.9823
469/469 [==============================] - 1s 2ms/step - loss: 0.0617 - sparse_categorical_accuracy: 0.9823 - val_loss: 0.0793 - val_sparse_categorical_accuracy: 0.9749
<keras.src.callbacks.History at 0x7fc41e0cb880>
Back to top