
TensorFlow 1.x and 2.x Saving Error: Using a `tf.Tensor` as a Python `bool` is not allowed

Ever tried to save a TensorFlow model with tf.compat.v1.saved_model.simple_save or a similar TF saving function?

Have you ever encountered this in TF 1.x:
TypeError: Using a tf.Tensor as a Python bool is not allowed. Use if t is not None: instead of if t: to test if a tensor is defined, and use TensorFlow ops such as tf.cond to execute subgraphs conditioned on the value of a tensor.
or this in TF 2.x:
OperatorNotAllowedInGraphError: using a tf.Tensor as a Python bool is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.

Don't be fooled! This error is not what it seems. At least it wasn't for me…

This was the save call that was causing the error:


token_tensor = tf.ones((input_len, batch_size), "int32", "token_tensor")
segment_tensor = tf.ones((input_len, batch_size), "int32", "segment_tensor")
mask_tensor = tf.ones((input_len, batch_size), "float32", "mask_tensor")
seq_out = model.get_sequence_output()

with tf.compat.v1.Session() as sess:
    tf.compat.v1.saved_model.simple_save(
        sess,
        export_dir,
        inputs={'input': token_tensor, 'segment': segment_tensor, 'mask': mask_tensor},
        outputs=seq_out,  # a bare tensor instead of a dict: this is the bug
        legacy_init_op=init_op
    )

See the error? It's very minor…
The problem was that the output tensor IS NOT INSIDE A DICT!
Duh! Isn't that obvious to infer from the error message…?

Looking at the source code of the save function is what actually made me see the issue!
simple_save.py
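The check that actually blows up lives in predict_signature_def and is an ordinary Python truth-value test; the stack traces below point straight at it:

if outputs is None or not outputs:
    raise ValueError('Prediction outputs cannot be None or empty.')

# With outputs={'out': seq_out}, "not outputs" simply checks whether the dict is empty.
# With outputs=seq_out, it tries to cast a tf.Tensor to a Python bool, which raises
# the error from the title.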

So here is the fix: just wrap your input and output tensors in a dict!

token_tensor = tf.ones((input_len, batch_size), "int32", "token_tensor")
segment_tensor = tf.ones((input_len, batch_size), "int32", "segment_tensor")
mask_tensor = tf.ones((input_len, batch_size), "float32", "mask_tensor")
seq_out = model.get_sequence_output()

with tf.compat.v1.Session() as sess:
    tf.compat.v1.saved_model.simple_save(
        sess,
        export_dir,
        inputs={'input': token_tensor, 'segment': segment_tensor, 'mask': mask_tensor},
        outputs={"out": seq_out},  # outputs wrapped in a dict: this is the fix
        legacy_init_op=init_op
    )

Happy TensorFlow hacking!

This is the full stack trace in TF 1.x:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
in ()
45 outputs= seq_out, #{'output': mask_tensor, 'norms': mask_tensor},
46 #outputs={'word_emb': model_wordembedding_output, 'sentence_emb': model_sentence_embedding_output},
---> 47 legacy_init_op=init_op
48 )
49 # print('saving done')

/home/loan/venv/XLNET_jupyter_venv/lib/python2.7/site-packages/tensorflow/python/util/deprecation.pyc in new_func(*args, **kwargs)
322 'in a future version' if date is None else ('after %s' % date),
323 instructions)
--> 324 return func(*args, **kwargs)
325 return tf_decorator.make_decorator(
326 func, new_func, 'deprecated',

/home/loan/venv/XLNET_jupyter_venv/lib/python2.7/site-packages/tensorflow/python/saved_model/simple_save.pyc in simple_save(session, export_dir, inputs, outputs, legacy_init_op)
79 signature_def_map = {
80 signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
---> 81 signature_def_utils.predict_signature_def(inputs, outputs)
82 }
83 b = builder.SavedModelBuilder(export_dir)

/home/loan/venv/XLNET_jupyter_venv/lib/python2.7/site-packages/tensorflow/python/saved_model/signature_def_utils_impl.pyc in predict_signature_def(inputs, outputs)
195 if inputs is None or not inputs:
196 raise ValueError('Prediction inputs cannot be None or empty.')
--> 197 if outputs is None or not outputs:
198 raise ValueError('Prediction outputs cannot be None or empty.')
199

/home/loan/venv/XLNET_jupyter_venv/lib/python2.7/site-packages/tensorflow/python/framework/ops.pyc in __nonzero__(self)
702 TypeError.
703 """
--> 704 raise TypeError("Using a tf.Tensor as a Python bool is not allowed. "
705 "Use if t is not None: instead of if t: to test if a "
706 "tensor is defined, and use TensorFlow ops such as "

TypeError: Using a tf.Tensor as a Python bool is not allowed. Use if t is not None: instead of if t: to test if a tensor is defined, and use TensorFlow ops such as tf.cond to execute subgraphs conditioned on the value of a tensor.

And in TensorFlow 2.x:

---------------------------------------------------------------------------
OperatorNotAllowedInGraphError Traceback (most recent call last)
in
86 inputs=bert_inputs,
87 outputs=table_tensor,
---> 88 legacy_init_op=init_op
89 )
90

~/venv/XLNET_py3_venv/lib/python3.7/site-packages/tensorflow_core/python/util/deprecation.py in new_func(*args, **kwargs)
322 'in a future version' if date is None else ('after %s' % date),
323 instructions)
--> 324 return func(*args, **kwargs)
325 return tf_decorator.make_decorator(
326 func, new_func, 'deprecated',

~/venv/XLNET_py3_venv/lib/python3.7/site-packages/tensorflow_core/python/saved_model/simple_save.py in simple_save(session, export_dir, inputs, outputs, legacy_init_op)
79 signature_def_map = {
80 signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
---> 81 signature_def_utils.predict_signature_def(inputs, outputs)
82 }
83 b = builder.SavedModelBuilder(export_dir)

~/venv/XLNET_py3_venv/lib/python3.7/site-packages/tensorflow_core/python/saved_model/signature_def_utils_impl.py in predict_signature_def(inputs, outputs)
195 if inputs is None or not inputs:
196 raise ValueError('Prediction inputs cannot be None or empty.')
--> 197 if outputs is None or not outputs:
198 raise ValueError('Prediction outputs cannot be None or empty.')
199

~/venv/XLNET_py3_venv/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py in __bool__(self)
755 TypeError.
756 """
--> 757 self._disallow_bool_casting()
758
759 def __nonzero__(self):

~/venv/XLNET_py3_venv/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py in _disallow_bool_casting(self)
524 else:
525 # Default: V1-style Graph execution.
--> 526 self._disallow_in_graph_mode("using a tf.Tensor as a Python bool")
527
528 def _disallow_iteration(self):

~/venv/XLNET_py3_venv/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py in _disallow_in_graph_mode(self, task)
513 raise errors.OperatorNotAllowedInGraphError(
514 "{} is not allowed in Graph execution. Use Eager execution or decorate"
--> 515 " this function with @tf.function.".format(task))
516
517 def _disallow_bool_casting(self):

OperatorNotAllowedInGraphError: using a tf.Tensor as a Python bool is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.

Python 3 Lambda functions – Quick cheat sheet

On this page you will find an overview of Python 3 lambda functions, how they are used, and their common applications.
Lambda functions are invaluable for quick and easy data cleaning, many other data-related tasks, and streaming data processing.

Simple Lambda
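The original snippet is missing here, so here is a minimal stand-in (names and values are just illustrative): a lambda is an anonymous, single-expression function you can assign to a variable or pass around.

add = lambda a, b: a + b
print(add(2, 3))  # 5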

Map Lambda

square = lambda x: x * x
print(list(map(square, [1, 2, 3, 4])))  # squares every element: [1, 4, 9, 16]

Map Lambda two lists
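The snippet for this one is also missing; map accepts several iterables at once, one per lambda argument, so a sketch could look like this (the lists are made up):

sums = map(lambda x, y: x + y, [1, 2, 3], [10, 20, 30])
print(list(sums))  # [11, 22, 33]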

Filter Lambda

Define a lambda expression that must evaluate to true for a list element to be kept in the output list.
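As a stand-in for the missing snippet, keeping only the even numbers of a list could look like this (illustrative values):

evens = filter(lambda x: x % 2 == 0, [1, 2, 3, 4, 5, 6])
print(list(evens))  # [2, 4, 6]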

Reduce Lambda

The reduce function allows us to accumulate a value over a list of inputs. Let's say we want to implement the product function.
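In Python 3, reduce lives in functools; a sketch of the product function mentioned above could look like this (the input list is made up):

from functools import reduce

product = reduce(lambda acc, x: acc * x, [1, 2, 3, 4])
print(product)  # 1*2*3*4 = 24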

Lambda If Else with Reduce

With this code, we can implement FizzBuzz in one line, but we split it up over a few lines so it is easier to understand.
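The original snippet is missing, but one possible way to combine an if/else lambda with reduce for FizzBuzz looks roughly like this (split over a few lines, as described above):

from functools import reduce

fizzbuzz = lambda n: "FizzBuzz" if n % 15 == 0 else \
                     "Fizz" if n % 3 == 0 else \
                     "Buzz" if n % 5 == 0 else str(n)

# reduce accumulates the per-number results into one output string
result = reduce(lambda acc, n: acc + fizzbuzz(n) + " ", range(1, 16), "")
print(result)  # 1 2 Fizz 4 Buzz Fizz 7 8 Fizz Buzz 11 Fizz 13 14 FizzBuzz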

Things I wish I knew before working with Azure: everything you should know before starting with Microsoft Azure!

  • What are resource groups in Azure?
    • What is a resource?
      • Any manageable item that you can rent through Azure is considered a resource: for example, virtual machines, storage accounts, web apps, databases, functions and more; basically anything you create and manage in Azure.
    • What is a resource provider?
      • Resource providers are the services that supply Azure with resources on demand. For example, Microsoft.Compute provides virtual machines, and Microsoft.Storage, as the name implies, provides storage. A provider also exposes the operations available on the resources it provides.
    • What is a resource manager template?
      • A resource manager template defines which resources to deploy to a resource group. With templates, you can define how resources are made available consistently, and also how and which resources to release when the system reaches a critical predefined state.
    • What is a resource group?
      • A resource group is a collection of all the building blocks you have defined for your app. If you want to share data between apps or functions, it often makes sense to put them in the same group, as that also makes exchanging data between them easier.
    • What does deploying a web app mean in azure context?
      • When we deploy a web app in Azure, all we do is tell Microsoft to rent out a few computer parts for us to run our server on! We can define our web app locally and then just upload it to the cloud servers, which will serve our content worldwide!
  • What are Azure Functions?
    • Serverless functions in Azure can be defined very simply and connected to any app with minimal effort! The code for the function is stored on Azure's servers and only invoked when it is triggered by one of the many trigger mechanisms. A function consists of a trigger, input bindings and output bindings, which we will explain in detail later on.
  • What are Azure Logic Apps?
    • Logic Apps enable you to automate and orchestrate tasks. They are one of the main tools to automate processes and save you precious time! Logic Apps even let you combine and chain multiple different apps into one. Connect everything with everyone is the motto of this set of features.
  • What is a storage account and why do I need one in Azure?
    • A storage account is a reference to all the data objects stored for your account, like blobs, files, queues, tables, disks and so on.
  • Redis Cache
    • Instead of renting normal data storage or distributed Hadoop storage, you can also rent a super fast Redis Cache, which is essentially RAM-backed, highly cacheable data storage. Depending on your use case, this can be very valuable for time-critical and efficiency-critical operations.
  • Power Shell / Bash Shell
    • Microsoft provides a great CLI to manage your cloud infrastructure (see the short example after this list).
  • What are containers?
    • A container is basically virtualized software. Instead of having to care about the hardware and the operating system, you just ask for a container and your software project runs inside it. The great thing about containers is that they are hardware and OS independent, so you can share your app container with someone and they can run your app without any issues, saving a huge amount of time when deploying software! Using a container-based design yields more efficient architectures. Containers also let your team work faster, deploy more efficiently and operate at a much larger scale. Using a container also means you do not have to set up a whole VM; the container holds just what the app needs, which makes it much more lightweight than a VM. In short, your software is decoupled from the hardware and OS, which leaves many developers with much less headache, and it makes for a clean split between infrastructure management and software logic management.
  • What are Azure function triggers?
    • Since functions in Azure are serverless, we need to define a trigger, which tells Azure when to call our function. There are many possible triggers; the most common ones fire on changes to Cosmos DB, blob storage or queue storage, or on a timer.
  • What are Azure function bindings?
    • Azure function bindings basically define the input and output arguments of any function in Azure.
  • What does serverless mean? Serverless function?
    • In the context of Azure, there are serverless functions and serverless Logic Apps. But they still run on a server, so in what sense are they serverless? The real meaning behind serverless is that developers do not have to worry about the servers; it all happens automagically in the backend, implemented by Microsoft's engineers.
  • BONUS : What is the difference between a VM and a container?
    • You can imagine a VM as virtualizing the hardware, while a container virtualizes the software.
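To give a feel for the CLI mentioned in the list above, here is what creating and inspecting a resource group looks like with the Azure CLI (the group name and region are placeholders):

az group create --name my-resource-group --location westeurope
az group list --output table
az webapp list --resource-group my-resource-group --output table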

Things I wish I knew about Azure Functions, before working with Azure

Every function in Azure consists of a trigger, input and output bindings, and of course the code defining the function!

What are Triggers?

Triggers are mechanisms that kick off the execution of your function. You can set up triggers for an HTTP request, a database update, or almost anything else.

What are bindings?

The bindings define which resources our function has access to. Each binding is provided as a parameter to the function.

How to configure bindings and triggers?

Every function is accompanied by a function.json, which defines the bindings, their directions, and the triggers. For compiled languages (so any non-scripting language) we do not have to create the function.json file ourselves, since it can be generated automatically from the function code. For scripting languages, we must define the function.json ourselves.
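To make that concrete, here is a sketch of what a function.json for an HTTP-triggered function typically looks like (the binding names and auth level are just example values):

{
  "bindings": [
    {
      "name": "req",
      "type": "httpTrigger",
      "direction": "in",
      "authLevel": "function",
      "methods": [ "get", "post" ]
    },
    {
      "name": "$return",
      "type": "http",
      "direction": "out"
    }
  ]
}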

What are Durable Functions?

Durable Functions extend Azure's classical functions with functions that can have state AND still live in a serverless environment! Durable Functions are also necessary if you want to create an Orchestrator Function. Durable Functions are made up of different classical Azure Functions.

What are some Durable Function patterns?

One common pattern for Durable Functions is function chaining: you chain together a bunch of normal functions, and each function's output is piped into the next one. There is also the fan-out/fan-in pattern, which runs a bunch of functions in parallel and waits for all of them to finish before returning the final result. Then there are async HTTP API calls which, like the name implies, let us make API calls that are not synchronous. There is also a pattern for programming a human in the loop, called human interaction. You can check the official docs here for more patterns.
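As a rough illustration of the function-chaining pattern, here is what an orchestrator can look like in the Python programming model (the activity names are placeholders; see the official docs linked above for complete samples):

import azure.durable_functions as df

def orchestrator_function(context: df.DurableOrchestrationContext):
    # each activity is a normal Azure Function; its output is piped into the next one
    x = yield context.call_activity("ActivityA", None)
    y = yield context.call_activity("ActivityB", x)
    z = yield context.call_activity("ActivityC", y)
    return z

main = df.Orchestrator.create(orchestrator_function)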

IntelliJ plugin for Microsoft Azure tutorial: deploying a web app

In this tutorial we will check out how to get the Microsoft Azure plugin and how to use it.

First of all, start your IDE, hit Shift twice in quick succession and enter "plugin" to get quickly to the plugin install menu.

Then just type Azure and install the first plugin suggested, which is developed by Microsoft.

After having installed it and created an account on the Azure website, you can log in to your account through IntelliJ.

Select the Tools tab in the top toolbar and log in to Azure using interactive mode; just type in the credentials you used when creating your Azure account.

Preparing the Resource Groups

I was following this great tutorial from Microsoft, but I (and probably a lot of other people) encountered an error when trying to launch a web app right after having created a new Azure account.

Before you can launch anything in Azure, you need resource groups. Even though the tutorial from Microsoft does not state it explicitly, you should really create a resource group before attempting this.

Here is how you create a resource group:

Login to your Azure account on the Microsoft website and head to “My Account”

Next select “Create a resource”

And then select “Web App”

Enter a name for the app and the resource group, then click Create New.

After having created the group, we are finally ready to deploy our app with IntelliJ!

Start a new project, select a Maven web app, and make sure you are creating the project from an archetype!
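If you prefer the command line over the IDE wizard, the equivalent Maven archetype call looks roughly like this (the groupId and artifactId are placeholders):

mvn archetype:generate -DgroupId=com.example -DartifactId=my-azure-webapp -DarchetypeArtifactId=maven-archetype-webapp -DinteractiveMode=false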

Then just go to the root folder of your project and right-click it in IntelliJ. You should now see the Azure options, which let you deploy your web app to the cloud!

If you did not log in to Azure before, do it now.

Then you have the option to use an existing web app or a new one. We want a new one, but we will use an existing resource group! For some reason, creating a resource group with the IntelliJ plugin seems to result in exceptions. The only way I have found to avoid those so far is to create the group manually in Azure and then use it for further deployments.

After hitting Run and waiting a few seconds, your console should update with a URL to your freshly deployed web app.

Thanks for reading and have fun in the cloud!

Java Spark Tips, Tricks and Basics 6 – How to broadcast a variable to the Spark cluster? Why do we need to broadcast variables?

Why do we need Spark broadcast variables?

Spark is all about cluster computing. In a cluster of nodes, each node of course has its own private memory.

If we want all the nodes in the cluster to work towards a common goal,  having shared variables just seems necessary.

Let's say we want to sum up all the rows in a CSV table with 1 million lines. It just makes sense to let one node work on half a million rows and the other node on the other half a million. Both calculate their results and then the driver program combines them.

Broadcasting allows us to create a read-only cached copy of a variable on every node in our cluster. The distribution of those variables is handled by efficient broadcast algorithms implemented by Spark under the hood. This also takes away the burden of thinking about serialization and deserialization, since good old Spark takes care of that!

This great broadcasting functionality is provided by the SparkContext class: calling its broadcast method hands you back a Broadcast object, which you then use to do your work.

How to broadcast a variable in Spark Java
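The original code snippet did not survive here, so below is a minimal sketch of broadcasting a small lookup table with the Java API (class name, app name and the example data are made up):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.broadcast.Broadcast;

import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class BroadcastExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("BroadcastExample").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // a read-only lookup table every executor should have a cached copy of
        Map<Integer, String> lookup = new HashMap<>();
        lookup.put(1, "one");
        lookup.put(2, "two");
        Broadcast<Map<Integer, String>> broadcastLookup = sc.broadcast(lookup);

        // inside a transformation, read the broadcast value with .value()
        sc.parallelize(Arrays.asList(1, 2, 1, 2))
          .map(id -> broadcastLookup.value().get(id))
          .collect()
          .forEach(System.out::println);

        sc.close();
    }
}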

What did we learn?

In this short tutorial, you learned what Spark broadcast is for, what it does, and how to use it in Java.

Java Spark – Errors while using the map function in cluster mode – Spark Java

Ever tried a map function, a forEach function or a similar lambda-based operation, and it runs in local mode but you cannot get it running in cluster mode?

Then you just found your solution!

 

First, go to your java root directory and call

 

If you have this error :

 

Or a Stack trace like this :

 

How to fix Java Error “The trustAnchors parameter must be non-empty” while building Maven Single Jar

Recently I tried to deploy my Java project as a single fat jar. To my rude awakening, Maven did not feel like creating jars anymore and kept failing with the certificate error from the title.

You might encounter this error, in one of several slightly different variants, if you recently reinstalled your Java and then try to build a jar with Maven; in each case the console spits out a message containing "the trustAnchors parameter must be non-empty".

How to fix the error

To fix the error, we will download the java JDK and extract the certificates into our local java installation.
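The individual command snippets below are missing, so here is a rough sketch of the whole procedure with hypothetical archive names and paths (adjust the JDK version and target directory to your system):

# 1. Get java: download a JDK archive from your vendor of choice
# 2. Extract the jdk (archive name is hypothetical)
tar -xzf jdk-8uXXX-linux-x64.tar.gz
# 3. Copy the certificate store into your local java installation
#    (/etc/ssl/certs/java/cacerts is the usual location on Debian/Ubuntu)
sudo cp jdk1.8.0_XXX/jre/lib/security/cacerts /etc/ssl/certs/java/cacerts
# On Debian/Ubuntu, reinstalling ca-certificates-java achieves much the same:
# sudo apt-get install --reinstall ca-certificates-java
# 4. Build the jar again
mvn clean package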

Get java

 

Extract the jdk

 

 

Copy the Certificate into your java directory

 

And now you can finally build your jar with

 

How to set up Hadoop 2.9 in Pseudo Cluster mode on a remote PC using SSH

In my <other tutorial> we learned what Hadoop is, why Hadoop is so awesome and what Hadoop is used for. Now I will show you how to set up Hadoop 2.9 in Pseudo Cluster mode on a VM using SSH.

Download Hadoop 2.9

wget http://www-eu.apache.org/dist/hadoop/common/hadoop-2.9.0/hadoop-2.9.0.tar.gz

Then unpack it (make sure you grabbed the binary distribution, not the -src source tarball)
tar -xvzf hadoop-2.9.0.tar.gz

Remember where you extracted this to, because we will need to add the path to the environment variables later!
To get the path, use the handy command
pwd

Download SSH and Rsync
sudo apt-get install ssh
sudo apt-get install rsync

Set up an SSH connection to localhost
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod og-wx ~/.ssh/authorized_keys

Set up Hadoop Environment Variables

sudo gedit ~/.bashrc

and append the following lines (afterwards, reload the file with source ~/.bashrc or open a new terminal so the variables take effect)
export HADOOP_HOME=/path/to/hadoop/folder
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin

The next step is to edit the hadoop-env.sh file, located inside your Hadoop folder at etc/hadoop/hadoop-env.sh.
We will add your Java home path to the Hadoop settings.
Change
export JAVA_HOME=${JAVA_HOME}
to
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
To make sure you use the right path, run
echo $JAVA_HOME
in your terminal to see your Java home path.

Enable Pseudo Cluster Mode

Now we can finally set up the configuration for Hadoop's pseudo-distributed mode.
The necessary files are located inside the etc/hadoop folder of your Hadoop installation; add the properties below inside the <configuration> element of each file.

hdfs-site.xml

<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>file:///home/user/hadoop/data/hdfs/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///home/user/hadoop/data/hdfs/datanode</value>
</property>

mapred-site.xml

<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>

yarn-site.xml

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>

core-site.xml

<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>

 

 

Then format the filesystem

bin/hdfs namenode -format
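If you want to sanity-check the installation right away, you can start the daemons with the scripts from the sbin folder we added to the PATH above and list them with jps:

start-dfs.sh    # starts the NameNode, DataNode and SecondaryNameNode
start-yarn.sh   # starts the ResourceManager and NodeManager
jps             # lists the running Hadoop daemons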

and we are done!

To see how to run Hadoop check this article out!

What is Hadoop and why is it awesome?

Introduction

Hadoop provides big companies with a means to distribute and store huge amounts of data, not only on one computer but on many! You can imagine it like your normal Windows or Unix filesystem, only distributed! At first you might think: wow, OK, so what, now I have my 4K video on 5 different computers, what do we get from that?

Usually, only one computer supplies us with the data we want; this one computer has only one network connection and limited bandwidth. If you have multiple computers in different locations using different connections to the internet, their bandwidth adds up and a big performance boost will be noticeable.

You can imagine it as one person having to deliver a giant rocket consisting of multiple big parts. That one person can only deliver one rocket part at a time. If we use two people, we have already doubled our speed! The same principle applies to downloading and uploading.

So instead of just one computer supplying you with a limited data stream, you have multiple computers serving you data at the same time!

 

If you want to set up Hadoop on your local machine, check <THIS> out!

If you are interested in setting up a Hadoop pseudo cluster, check <THIS> out!

If you want to learn about basic Java interaction with Hadoop (downloading and uploading from a distributed file system), check <THIS> out!

 

Notation

Model

FileSystem Class