
4. Single Layer Neural Network in TensorFlow

The field of Machine Learning has expanded greatly thanks to the co-development of key areas such as computing, massive data storage and Internet technologies.

Technologies such as speech recognition, image classification on our phones or the detection of spam emails have enabled apps that a decade ago would have sounded possible only in science fiction.

As one of the main promoters of the library we developed at Berkeley (Caffe) in 2012 as a PhD student, I can say that TensorFlow, presented in this book and also designed by Google (California), where I have been researching since 2013, will be one of the main tools that researchers and SME companies will use to develop their ideas about Deep Learning and Machine Learning.

My research focus is gradually moving from supercomputing architectures and runtimes to execution middleware for big data workloads, and more recently to platforms for Machine Learning on massive data.

In the first chapter, in addition to an introduction to the scenario in which TensorFlow will have an important role, I take the opportunity to explain the basic structure of a TensorFlow program and to describe briefly the data it maintains internally.

In chapter two, through an example of linear regression, I will present some code basics and, at the same time, how to call various important components in the learning process, such as the cost function or the gradient descent optimization algorithm.

In chapter three, where I present a clustering algorithm, I go into detail to present the basic data structure of TensorFlow called tensor, and the different classes and functions that the TensorFlow package offers to create and manage the tensors.

The next chapter begins with an explanation based on neural network concepts seen in the previous chapter and introduces how to construct a multilayer neural network to get a better result in the recognition of handwritten digits.

Last October, when Alphabet announced Google’s quarterly results, with considerable increases in sales and profits, CEO Sundar Pichai said clearly: “Machine learning is a core, transformative way by which we’re rethinking everything we’re doing”.

While TensorFlow makes it easier for developers to build machine learning algorithms and train them for certain types of data inputs, TensorFlow Serving specializes in making these models usable in production environments.

This allows developers to experiment at large scale with different models that change over time based on real-world data, while keeping a stable architecture and API in place.

In the typical pipeline, training data is fed to the learner, which outputs a model; after being validated, the model is ready to be deployed to the TensorFlow Serving system. It is quite common to launch and iterate on our models over time, as new data becomes available or as we improve the model.

In fact, in the Google post [4] they mention that at Google many pipelines are running continuously, producing new model versions as new data becomes available.

Finally, when you have finished, you should deactivate the virtual environment. Given the introductory nature of this book, we suggest that the reader visit the mentioned official documentation page to find more information about other ways to install TensorFlow.

With this simple example, I tried to introduce the idea that the normal way to program in TensorFlow is to specify the whole problem first, and eventually create a session to run the associated computation.
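A minimal sketch of that pattern (the operation and the placeholder values are illustrative, and tf.multiply was named tf.mul in the older releases the book targets): the whole graph is described first, and only the session run triggers the computation.

    import tensorflow as tf

    # Build the graph first: nothing is computed at this point.
    a = tf.placeholder(tf.float32)
    b = tf.placeholder(tf.float32)
    y = tf.multiply(a, b)          # tf.mul in older TensorFlow releases

    # Only when a session runs the graph does the computation happen.
    sess = tf.Session()
    print(sess.run(y, feed_dict={a: 3.0, b: 3.0}))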

The representation of the information as a graph allows TensorFlow to know the dependencies between operations and to assign operations to devices asynchronously, and in parallel, as soon as these operations have their associated input tensors (indicated on the incoming edges) available.

Based on this example, I will present some code basics and, at the same time, how to call various important components of the learning process, such as the cost function or the gradient descent algorithm.

Remember that, both in the case of two variables (simple regression) and in the case of more than two variables (multiple regression), linear regression models the relationship between a dependent variable y, independent variables xi and a random term b.

In this section I will create a simple example to explain how TensorFlow works assuming that our data model corresponds to a simple linear regression as y = W * x + b.

The code we have created is as follows. As you can see from the code, we have generated points following the relationship y = 0.1 * x + 0.3, albeit with some variation, using a normal distribution, so the points do not fully correspond to a line, which lets us build a more interesting example.
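A sketch of that data generation step; the sample size and the standard deviations of the noise are illustrative choices:

    import numpy as np

    num_points = 1000
    x_data, y_data = [], []
    for _ in range(num_points):
        x = np.random.normal(0.0, 0.55)                    # x drawn from a normal distribution
        y = 0.1 * x + 0.3 + np.random.normal(0.0, 0.03)    # y follows the line plus some noise
        x_data.append(x)
        y_data.append(y)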

The reader can view them with the following code (in this case, we need to import some functions of the matplotlib package, installed by running pip install matplotlib [13]). These points are the data that we will consider the training dataset for our model.
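A possible plotting snippet, continuing from the x_data and y_data lists above:

    import matplotlib.pyplot as plt

    plt.plot(x_data, y_data, 'ro')   # the training points as red dots
    plt.show()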

The objective is to write TensorFlow code that finds the best parameters W and b so that, from the input data x_data, they fit the output data y_data; in our case it will be a straight line defined by y_data = W * x_data + b.

The standard way to solve such problems is to iterate through each value of the data set and modify the parameters W and b in order to get a more precise answer every time.

To do this, we will first create three variables with the following statements. For now, we can move forward knowing only that the call to the Variable method defines a variable that resides in the internal graph data structure of TensorFlow, of which I have spoken above.
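A sketch of those three statements; the initialization values (a random weight in [-1, 1] and a zero bias) are the usual illustrative choices:

    import tensorflow as tf

    W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))   # weight, started at a random value in [-1, 1]
    b = tf.Variable(tf.zeros([1]))                        # bias, started at zero
    y = W * x_data + b                                    # the model's prediction for every x in x_data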

Now, with these variables defined, we can express the cost function that we discussed earlier, based on the distance between each known point and the point calculated with the function y = W * x + b.

In TensorFlow this cost function is expressed as follows: As we see, this expression calculates the average of the squared distances between the y_data point that we know, and the point y calculated from the input x_data.
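Expressed as a single line, continuing from the y defined above, the cost could look like this:

    loss = tf.reduce_mean(tf.square(y - y_data))   # mean of the squared distances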

At a theoretical level, gradient descent is an algorithm that, given a function defined by a set of parameters, starts with an initial set of parameter values and iteratively moves toward a set of values that minimizes the function.

The algorithm begins with the initial values of a set of parameters (in our case W and b), and then iteratively adjusts the values of those variables so that, at the end of the process, they minimize the cost function.

To use this algorithm in TensorFlow, we just have to execute the following two statements. For now, it is enough to know that TensorFlow has created the relevant data in its internal data structure, and that it has also implemented in this structure an optimizer that may be invoked by train, which applies a gradient descent algorithm to the defined cost function.
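A sketch of those two statements; the learning rate of 0.5 is an illustrative value:

    optimizer = tf.train.GradientDescentOptimizer(0.5)   # learning rate chosen for illustration
    train = optimizer.minimize(loss)                     # each run of 'train' nudges W and b to reduce the loss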

Also, because we have specified variables in the code, we must initialize them beforehand with the following calls. Now we can start the iterative process that will allow us to find the values of W and b that define the model line that best fits the input points.

In our particular example, if we assume that only 8 iterations are sufficient, the code could be as follows. The result of running this code shows that the values of W and b are close to the values that we know beforehand.
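A sketch of the initialization and the training loop just described (tf.global_variables_initializer was called tf.initialize_all_variables in the releases the book targets):

    init = tf.global_variables_initializer()   # tf.initialize_all_variables() in older releases

    sess = tf.Session()
    sess.run(init)

    for step in range(8):
        sess.run(train)

    print(sess.run(W), sess.run(b))   # values should approach 0.1 and 0.3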

In my case, the result of the print shows values close to W = 0.0854 and b = 0.299, achieved with only 8 iterations. If we graphically display the result with the following code, we can see the line defined by those parameters:
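A possible plotting snippet for that figure, reusing the session and variables from the sketch above:

    plt.plot(x_data, y_data, 'ro')                          # the training points
    plt.plot(x_data, sess.run(W) * x_data + sess.run(b))    # the fitted line
    plt.show()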

We can use the following statement to print the values of W and b. In our case, the printed output shows that the algorithm begins with the initial values W = -0.0484 and b = 0.2972 and then iteratively adjusts the variables so that their values minimize the cost function.

When TensorFlow runs gradient descent search, it will start from some location on this surface (in our example the point W= -0.04841119 and b=0.29720169) and move downhill to find the line with the lowest error.

Here you will find it all together for easy reference. In this chapter we have begun to explore the possibilities of the TensorFlow package with a first intuitive approach to two fundamental pieces: the cost function and the gradient descent algorithm, using a basic linear regression algorithm for their introduction.

Linear regression, which was presented in the previous chapter, is a supervised learning algorithm in which we use the data and output values (or labels) to build a model that fits them.

The following table shows the relationship between them, in order to make it easier to follow the TensorFlow documentation. These tensors can be manipulated with a series of transformations supplied by the TensorFlow package.

If we obtain the shape of this tensor with the get_shape() operation, we can inspect the size of each dimension: print expanded_vectors.get_shape() appears on the screen as TensorShape([Dimension(1), Dimension(2000), Dimension(2)]). Later in this chapter we will see that, thanks to TensorFlow shape broadcasting, many mathematical tensor manipulation functions (as presented in the first chapter) are able to discover for themselves the size of a dimension with unspecified size and assign this deduced value to it.

The process is not complex, and given the introductory nature of this book I invite the reader to visit the website of TensorFlow[19] for more details on how to download data from different file types.

TensorFlow offers a collection of operations that produce random tensors with different distributions: An important detail is that all of these operations require a specific shape of the tensors as the parameters of the function, and the variable that is created has the same shape.
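A few of these constructors, with illustrative shapes and parameters:

    r_normal  = tf.random_normal([2000, 2], mean=0.0, stddev=1.0)      # normal distribution
    r_uniform = tf.random_uniform([2000, 2], minval=-1.0, maxval=1.0)  # uniform distribution
    r_trunc   = tf.truncated_normal([2000, 2], stddev=0.1)             # normal, truncated at 2 stddev
    shuffled  = tf.random_shuffle(r_normal)                            # shuffles along the first dimension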

The result of the algorithm is a set of K points, called centroids, which are the centers of the different groups obtained, and a tag for each point indicating the one cluster among the K to which it is assigned.

The allocation step (step 1) and the update step (step 2) are alternated in a loop until the algorithm is considered to have converged, which may be, for example, when the assignments of points to groups no longer change.

I propose to do something simple: generate 2,000 points in a 2D space at random, following two normal distributions, in order to draw up a space that allows us to better understand the outcome.

To display the points that have been generated randomly, I suggest the following code. This code generates a graph of points in a two-dimensional space like the following screenshot:
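A sketch of the generation and plotting code just described; the means and standard deviations of the two normal distributions are illustrative choices:

    import numpy as np
    import matplotlib.pyplot as plt

    num_points = 2000
    vectors_set = []
    for _ in range(num_points):
        if np.random.random() > 0.5:
            vectors_set.append([np.random.normal(0.0, 0.9), np.random.normal(0.0, 0.9)])
        else:
            vectors_set.append([np.random.normal(3.0, 0.5), np.random.normal(1.0, 0.5)])

    plt.plot([v[0] for v in vectors_set], [v[1] for v in vectors_set], 'bo')
    plt.show()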

I suggest the reader check the result in the assignment_values tensor with the following code, which generates a graph like the one above. The screenshot with the result of the execution of my code is shown in the following figure:

The aim is to extend both tensors from 2 dimensions to 3 dimensions to make the sizes match in order to perform a subtraction: tf.expand_dims inserts one dimension in each tensor;

Therefore, in the allocation step (step 1) the algorithm can be expressed in these four lines of TensorFlow code, which calculate the squared Euclidean distance. If we look at the shapes of the tensors diff, sqr, distances and assignments, we see them in turn as follows: that is, the tf.sub function has returned the tensor diff, which contains the difference between the points (indexed in dimension D1) and the centroids (indexed in dimension D0).
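A sketch of that allocation step, continuing from the vectors_set list above. The number of clusters k and the centroid initialization (k shuffled points) are illustrative choices, and tf.sub is the old name used by the book's TensorFlow release (tf.subtract in newer ones):

    import tensorflow as tf

    k = 4
    vectors = tf.constant(vectors_set)                                              # shape (2000, 2)
    centroids = tf.Variable(tf.slice(tf.random_shuffle(vectors), [0, 0], [k, -1]))  # k random points

    expanded_vectors = tf.expand_dims(vectors, 0)        # shape (1, 2000, 2)
    expanded_centroids = tf.expand_dims(centroids, 1)    # shape (k, 1, 2)

    diff = tf.sub(expanded_vectors, expanded_centroids)  # broadcast subtraction, shape (k, 2000, 2)
    sqr = tf.square(diff)
    distances = tf.reduce_sum(sqr, 2)                    # squared Euclidean distance, shape (k, 2000)
    assignments = tf.argmin(distances, 0)                # index of the nearest centroid for each point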

In the table below you can find a summary of the most important ones. Finally, the assignment is achieved with tf.argmin, which returns the index with the minimum value across the given tensor dimension (in our case D0, which, remember, was the centroid dimension).

We also have the tf.argmax operation. In fact, the 4 instructions seen above could be summarized in only one line of code, as we saw in the previous section. Either way, the internal tensors, and the operations they define as nodes of the internal graph, are the same as the ones described before.
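That one-line form, under the same assumptions as the sketch above, could look like this:

    assignments = tf.argmin(tf.reduce_sum(
        tf.square(tf.sub(expanded_vectors, expanded_centroids)), 2), 0)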

In the code of the previous section we saw this line of code. In that piece of code, we can see that the means tensor is the result of concatenating the k tensors that correspond to the mean value of the points that belong to each of the k clusters.

Next, I will comment on each of the TensorFlow operations involved in the calculation of the mean value of the points that belong to each cluster [23]. Anyway, if the reader wants to dig deeper into the code, as I always say, you can find more information about each of these operations, with very illustrative examples, on the TensorFlow API page [24].
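A sketch of that line, written with the tf.concat(dimension, values) argument order and the reduction_indices keyword of the old releases the book uses (newer releases expect tf.concat(values, axis) and the axis keyword):

    means = tf.concat(0, [
        tf.reduce_mean(
            tf.gather(vectors,
                      tf.reshape(tf.where(tf.equal(assignments, c)), [1, -1])),
            reduction_indices=[1])
        for c in range(k)])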

To do this, we need to create an operator that assigns the value of the means tensor to the centroids variable in such a way that, when the run() operation is executed, the values of the updated centroids are used in the next iteration of the loop. We also have to create an operator to initialize all of the variables before starting to run the graph. At this point everything is ready.
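A sketch of the update operator, the initialization and the main loop; the number of iterations is an illustrative choice:

    update_centroids = tf.assign(centroids, means)     # copy the new means into the centroids variable
    init_op = tf.global_variables_initializer()        # tf.initialize_all_variables() in older releases

    sess = tf.Session()
    sess.run(init_op)
    for step in range(100):
        _, centroid_values, assignment_values = sess.run(
            [update_centroids, centroids, assignments])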

If the reader is interested in learning more about the theoretical concepts of this example after reading this chapter, I suggest reading Neural Networks and Deep Learning [28], available online, which presents this example but goes deeper into the theoretical concepts.

The MNIST data-set is composed of black and white images containing hand-written digits, with more than 60,000 examples for training a model and 10,000 for testing it.

This data-set is ideal for most people who are beginning with pattern recognition on real examples, without having to spend time on data pre-processing or formatting, two very important but time-consuming steps when dealing with images.

It is worth mentioning that when we reduce the structure to 2 dimensions we may lose part of the information, and for some computer vision algorithms this could affect the result, but for the simplest method used in this tutorial it will not be a problem.

Although the book doesn’t focus on the theoretical concepts of neural networks, a brief and intuitive introduction to how neurons work to learn the training data will help the reader to understand what is happening.

In this situation, the input data is represented by vectors of the form (x, y) representing the coordinates in this 2-dimensional space, and our function returns ‘0’ or ‘1’ (above or below the line) to classify each point as a “square” or a “circle”.

Recall that the sigmoid produces y = 1 / (1 + e^(-z)). If the input z is a large negative number, then e^(-z) is “e” raised to a large positive number, which is itself large, so the denominator becomes large and the final y approaches 0.

It is worth mentioning that there is a specific case of neural networks (on which Chapter 5 is based) where the neurons are organized in layers, in such a way that the lower layer (input layer) receives the inputs, and the top layer (output layer) produces the response values.

For example, when we want to classify data into more than two classes at the output layer, we can use the Softmax [34] activation function, a generalization of the sigmoid function. Softmax yields the probability of each class, such that the probabilities sum to 1, and the predicted result is the one with the highest probability.

For example, our model could predict a “9” in an image with 80% certainty, but give a 5% chance of it being an “8” (due to a dubious lower trace), and also give certain low probabilities to its being any other number.

In this case, we chose a model like the one below, where the red (or light gray in the b/w edition) represents negative weights (that is, it reduces the support for those pixels when present in a “0”), while the blue (the darker gray in the b/w edition) represents positive weights.

For each i (between 0 and 9) we have a matrix Wi of 784 elements (28×28), where each element j is multiplied by the corresponding component j of the input image (which has 784 components), and then bi is added.

For this purpose, the following schema depicts the data structures and their relations (to help the reader easily recall each piece of our problem). First of all, we create two variables to contain the weights W and the bias b. Those variables are created using the tf.Variable function and an initial value for each variable;

In this case study using MNIST, we also create a two-dimensional tensor to keep the information of the x points, with the following line of code. The tensor x will be used to store the MNIST images as vectors of 784 floating point values (using None we indicate that the dimension can be any size).
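A sketch of those three definitions; the zero initialization of W and b is the usual illustrative choice:

    import tensorflow as tf

    W = tf.Variable(tf.zeros([784, 10]))       # one column of weights per output class
    b = tf.Variable(tf.zeros([10]))
    x = tf.placeholder("float", [None, 784])   # None: any number of images in a batch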

In our case, we provide this function with the resulting tensor of multiplying the image vector x by the weight matrix W and adding b. Once the model implementation is specified, we can write the code needed to obtain the weights W and the bias b using an iterative training algorithm.
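In code, continuing from the tensors above, the model could be written as:

    y = tf.nn.softmax(tf.matmul(x, W) + b)   # a probability per class for every image in the batch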

To implement the cross-entropy measurement we need a new placeholder for the correct labels. Using this placeholder, we can implement the cross-entropy with the following line of code, representing our cost function: first, we calculate the logarithm of each element y with the built-in TensorFlow function tf.log(), and then we multiply it by each y_ element.
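A sketch of the label placeholder and the cost function just described:

    y_ = tf.placeholder("float", [None, 10])          # correct labels, one-hot encoded
    cross_entropy = -tf.reduce_sum(y_ * tf.log(y))    # cross-entropy cost over the batch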

We will use the backpropagation (backward propagation of errors) algorithm which, as its name indicates, propagates backwards the error obtained at the outputs in order to recompute the weights W; this is especially important for multi-layer neural networks.

This method is used together with the previously seen gradient descent method which, using the cross-entropy cost function, allows us to compute how much the parameters must change on each iteration in order to reduce the error, using the local information available at each moment.

In our case, intuitively, it consists of changing the weights W a little bit on each iteration (this little bit is expressed by a learning rate hyperparameter, indicating the speed of change) in order to reduce the error.

So, in our example using the MNIST images, the following line of code indicates that we are using the backpropagation algorithm to minimize the cross-entropy with the gradient descent algorithm and a learning rate of 0.01. Once here, we have specified the whole problem and we can start the computation by instantiating tf.Session(), which is in charge of executing the TensorFlow operations on the devices available on the system, CPUs or GPUs. Next, we execute the operation that initializes all the variables. From this moment on, we can start training our model.
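A sketch of those steps (the initializer name follows the newer API; the releases the book targets used tf.initialize_all_variables):

    train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

    sess = tf.Session()
    sess.run(tf.global_variables_initializer())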

We have to write the following lines of code. The first line inside the loop specifies that, on each iteration, a batch of 100 inputs, randomly sampled from the training data-set, is picked.

To determine which fraction of the predictions is correct, we can cast the values to floating point numbers and take the mean with the following operation. For example, [True, False, True, True] will turn into [1, 0, 1, 1], and the average will be 0.75, representing the percentage of accuracy.
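A sketch of the training loop and the accuracy check, assuming the data-set was loaded into a mnist object with the input_data helper used by the TensorFlow MNIST tutorials:

    from tensorflow.examples.tutorials.mnist import input_data
    mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

    for i in range(1000):
        batch_xs, batch_ys = mnist.train.next_batch(100)   # 100 randomly sampled training examples
        sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))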

Just to provide a global view of it, I’ll put it here all together.

In this chapter I will program, with the reader, a simple Deep Learning neural network using the same MNIST digit recognition problem of the previous chapter.

In the rest of this chapter, I will use an example code as the backbone, alongside which I will explain the two most important concepts of these networks: convolutions and pooling, without entering into the details of the parameters, given the introductory nature of this book.

Let’s have a look at our MNIST digit recognition example: after reading in the MNIST data and defining the placeholders using TensorFlow as we did in the previous example, we can reconstruct the original shape of the images of the input data.

We can do this as follows: here we change the input into a 4D tensor, whose second and third dimensions correspond to the width and the height of the image, while the last dimension corresponds to the number of color channels, 1 in this case.
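A one-line sketch of that reshape, where -1 lets TensorFlow infer the batch dimension:

    x_image = tf.reshape(x, [-1, 28, 28, 1])   # (batch, height, width, channels)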

Analyzing the concrete case that we have proposed, we observe that, given an input image of size 28×28 and a window of size 5×5, we obtain a 24×24 space of neurons in the first hidden layer, because we can only move the window 23 times down and 23 times to the right before hitting the bottom right edge of the input image.

In order to simplify the code, I define the following two functions related to the weight matrix W and bias b: Without going into the details, it is customary to initialize the weights with some random noise and the bias values slightly positive.
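A sketch of those two helper functions; the noise level and the bias constant are illustrative values:

    def weight_variable(shape):
        initial = tf.truncated_normal(shape, stddev=0.1)   # small random noise
        return tf.Variable(initial)

    def bias_variable(shape):
        initial = tf.constant(0.1, shape=shape)            # slightly positive bias
        return tf.Variable(initial)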

We must define a tensor to hold this weight matrix W with the shape [5, 5, 1, 32]: the first two dimensions are the size of the window, the third is the number of input channels, which is 1 in our case, and the last one is the number of filters, 32.

Using the previously defined functions we can write this in TensorFlow as follows: The ReLU (Rectified Linear unit) activation function has recently become the default activation function used in the hidden layers of deep neural networks.

The code that we are writing will first apply the convolution of the weight tensor W_conv1 to the input images x_image, then add the bias, and finally apply the ReLU activation function.
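A sketch of that first convolutional layer followed by a 2×2 max pooling. Note that the 'SAME' padding used here keeps the spatial size at 28×28 (the 24×24 figure discussed above corresponds to a convolution without padding), so the padding and stride values are assumptions of this sketch:

    W_conv1 = weight_variable([5, 5, 1, 32])
    b_conv1 = bias_variable([32])

    h_conv1 = tf.nn.relu(
        tf.nn.conv2d(x_image, W_conv1, strides=[1, 1, 1, 1], padding='SAME') + b_conv1)
    h_pool1 = tf.nn.max_pool(h_conv1, ksize=[1, 2, 2, 1],
                             strides=[1, 2, 2, 1], padding='SAME')   # 2x2 max pooling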

In this case we have to pass 32 as the number of input channels, as that is the output size of the previous layer. The resulting output of the convolution has a dimension of 8×8, as we are applying the 5×5 window to a 12×12 space with a stride size of 1.

The tensors for the weights and biases are as follows. Remember that the first dimension of the tensor represents the 64 filters of size 7×7 from the second convolutional layer, while the second parameter is the number of neurons in the layer, which we are free to choose (in our case 1024).

This is achieved by multiplying the weight matrix W_fc1 with the flattened vector and adding the bias b_fc1, after which we apply the ReLU activation function. The next step will be to reduce the number of effective parameters in the neural network using a technique called dropout.
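A sketch of the second convolutional layer, the fully connected layer and the dropout step; as above, 'SAME' padding is assumed, which after two 2×2 poolings leaves 7×7 feature maps and therefore the 7 * 7 * 64 flattened size mentioned in the text:

    W_conv2 = weight_variable([5, 5, 32, 64])    # 32 input channels, 64 filters
    b_conv2 = bias_variable([64])
    h_conv2 = tf.nn.relu(
        tf.nn.conv2d(h_pool1, W_conv2, strides=[1, 1, 1, 1], padding='SAME') + b_conv2)
    h_pool2 = tf.nn.max_pool(h_conv2, ksize=[1, 2, 2, 1],
                             strides=[1, 2, 2, 1], padding='SAME')

    W_fc1 = weight_variable([7 * 7 * 64, 1024])
    b_fc1 = bias_variable([1024])
    h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])         # flatten for the dense layer
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

    keep_prob = tf.placeholder("float")
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)                 # dropout reduces overfitting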

The softmax layer code is as follows. We are now ready to train the model that we have just defined, by adjusting all the weights in the convolutional and fully connected layers to obtain the predictions of the images for which we have a label.
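A sketch of that final layer, continuing from the dropout output above:

    W_fc2 = weight_variable([1024, 10])
    b_fc2 = bias_variable([10])
    y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)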

The following code is very similar to the one in the previous chapter, with one exception: we replace the gradient descent optimizer with the ADAM optimizer, an algorithm that offers certain advantages according to the literature [42].
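A sketch of that training code; the learning rate, the number of steps, the batch size and the dropout keep probability are illustrative values:

    cross_entropy = -tf.reduce_sum(y_ * tf.log(y_conv))
    train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)   # ADAM instead of plain gradient descent

    sess = tf.Session()
    sess.run(tf.global_variables_initializer())
    for i in range(1000):
        batch = mnist.train.next_batch(50)
        sess.run(train_step, feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})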

If we want a specific operation to be executed on a specific device, instead of letting the system select a device automatically, we can use tf.device to create a device context, so that all the operations in that context are assigned to the same device.
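A minimal sketch of a device context; the device string and the operations are illustrative:

    with tf.device('/gpu:0'):                        # pin these operations to the first GPU
        m1 = tf.constant([[1.0, 2.0], [3.0, 4.0]])
        m2 = tf.constant([[5.0, 6.0], [7.0, 8.0]])
        product = tf.matmul(m1, m2)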

To conclude this brief chapter, we present a snippet of code inspired by the one shared by DamienAymeric on GitHub [46], which computes A^n + B^n for n = 10 and compares the execution time with 1 GPU against 2 GPUs, using the datetime Python package.

First of all, we import the required libraries. We create two matrices with random values using the numpy package, and then we create the two structures to store the results. Next, we define the matpow() function. As we have seen, to execute the code on a single GPU we have to specify it explicitly, and for the case with 2 GPUs we assign one part of the computation to each device. Finally, we print the results together with the registered computation times.

As I said at the beginning of this chapter, in February 2016 Google released the distributed version of TensorFlow, which is supported by gRPC, a high performance open source RPC framework for inter-process communication (the same protocol used by TensorFlow Serving).
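A condensed sketch of the kind of comparison described; the matrix size, device strings and session options (allow_soft_placement, so the code still runs on machines without two GPUs) are assumptions of this sketch rather than the exact code from [46]:

    import datetime
    import numpy as np
    import tensorflow as tf

    n = 10
    A = np.random.rand(1000, 1000).astype('float32')
    B = np.random.rand(1000, 1000).astype('float32')

    def matpow(M, n):
        # M raised to the power n via repeated tf.matmul
        return M if n < 1 else tf.matmul(M, matpow(M, n - 1))

    # One GPU: both matrix powers on /gpu:0
    with tf.device('/gpu:0'):
        a = tf.constant(A)
        b = tf.constant(B)
        one_gpu = matpow(a, n) + matpow(b, n)

    t0 = datetime.datetime.now()
    with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
        sess.run(one_gpu)
    print("1 GPU:", datetime.datetime.now() - t0)

    # Two GPUs: one matrix power per device, sum on the CPU
    with tf.device('/gpu:0'):
        p1 = matpow(tf.constant(A), n)
    with tf.device('/gpu:1'):
        p2 = matpow(tf.constant(B), n)
    with tf.device('/cpu:0'):
        two_gpus = p1 + p2

    t0 = datetime.datetime.now()
    with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
        sess.run(two_gpus)
    print("2 GPUs:", datetime.datetime.now() - t0)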

One striking fact: the day after TensorFlow was released by Google, I read in a tweet [49] that during the period 2010-2014 a new Deep Learning package was released every 47 days, and in 2015 one was released every 22 days.

A lot of research is still necessary in order to integrate the best analytics knowledge with new Big Data technologies and the awesome power of emerging computational systems in order to interpret massive amounts of heterogeneous data at an unprecedented rate.

Scientific progress is typically the result of an interdisciplinary, long and sustained effort by a large community rather than a breakthrough, and deep learning, and machine learning in general, is not an exception.

We are entering an extremely exciting period for interdisciplinary research, where ecosystems like the ones found in Barcelona, such as UPC and BSC-CNS, with deep knowledge in High Performance Computing and Big Data technologies, will play a big role in this new scenario.

Writing a book requires motivation but also a lot of time, so I want to start by thanking my family for the support and understanding they have shown toward a laptop sharing many weekends and part of the Christmas holidays with us, ever since Google announced last November that it was releasing TensorFlow.

I sincerely want to thank Oriol Vinyals for his availability and enthusiasm in writing the foreword to this book, which has been for me the first great recognition of the effort involved.

I met Oriol a couple of years ago, after exchanging a few emails, and in person last year.

He is truly a star in this field, of whom our country should feel very proud, and whom we should try to tempt so that one day he leaves Silicon Valley and comes to Barcelona to found our own Mediterranean Silicon Valley here.

As I mentioned in the preface of the book, a former student with a degree in physics and an engineering degree in computer science, as well as one of the best research interns I have had at BSC, has played a very important role in this work.

That person is Ferran Julià, who together with Oriol Núñez has founded a startup, based in my county, where they are preparing to analyze images with convolutional neural networks, among the many other things that UNDERTILE offers.

This has allowed Ferran Julià to perform to perfection the role of editor of this book, weighing in on both form and content, just as my editor Llorenç Rubió did when I published my first book with the publisher Libros de Cabecera.

I am also deeply grateful to Oriol Núñez for the idea he shared with me of expanding the reach of this book and bringing its benefits to many more people than I originally had in mind, through his joint project with the Fundació El Maresme for the social integration and the improvement of the quality of life of people with intellectual disabilities in my county.

My most sincere thanks to all those who read this work, in part or in full, before it saw the light.

In particular, to an important data scientist such as Aleix Ruiz de Villa, who gave me interesting comments to include in the version you now hold in your hands.

Many experts on this subject whom I do not know personally have also helped me with this book by allowing me to share their ideas and even their code; for that reason I mention the sources in detail in the corresponding sections, more as a token of gratitude than because the reader needs to consult them.

Thanks to UPC Barcelona Tech, which has been the working environment that has allowed me to carry out my research on these topics and accumulate the knowledge I want to share here.

It is also the university that lets me teach at the Facultat d’Informàtica de Barcelona, to brilliant students who encourage me to write works like this one.

Thanks as well to the Barcelona Supercomputing Center (BSC), and especially to its director Mateo Valero and the Computer Science directors Jesús Labarta and Eduard Ayguadé, who have always allowed and supported this obsession of mine with keeping an ear out for the technologies to come.

I would especially like to mention two of my UPC colleagues, with whom I am working side by side to start up this more “analytics”-oriented line of research: Rubèn Tous and Joan Capdevila have shown blind faith in my urge to explore new topics, so that our knowledge can contribute to this new area called High-Performance Big-Data Analytics.

Related to them, my thanks to another great data scientist, Jesús Cerquides, of the Artificial Intelligence Research Institute of the CSIC, through whom, by co-supervising a doctoral thesis, I am discovering a new and exciting galaxy in the Machine Learning universe.

Nor can I forget those from whom I learn a great deal: students whose master’s theses deal with these topics, such as Sana Imtiaz and Andrea Ferri.

Speaking of GPUs, thanks to Nacho Navarro, head of the BSC/UPC NVIDIA GPU Center of Excellence, for making its resources available to me from the very first moment to “train” my neural networks.

My thanks to UPC professor Ricard Gavaldà, one of the best data scientists in the country, who very patiently took me by the hand in my first steps into the, for me, inhospitable but fascinating world of Machine Learning in mid-2006, creating together with Toni Moreno-García, Josep Ll. Berral and Nico Poggi the first hybrid team of Data Scientists and Computer Engineers. Unforgettable times!

Thanks to that experience, our research group incorporated Machine Learning, with results as brilliant as the theses of Josep Ll. Berral, Javier Alonso and Nico Poggi, in which we used Machine Learning for resource management in today’s complex computing systems.

But it was not until a few years later, in 2014, when, with Jordi Nin joining the group and later Jose A.

And thanks to Mauro Cavaller, a great collaborator of Màrius, whose contribution was key to my first book, and who this time has provided a final formal review.

Thanks to Oriol Pedrera, a genius who masters all the plastic arts, who has very patiently accompanied me through the various techniques I have used to create the illustrations of this book, which, together with Júlia Torres and Roser Bellido, we refined again and again until arriving at the version you find in this book.

On the artistic side, I would not want to forget the great cabinetmaker Rafa Jiménez, who agreed, without complaint, to build me a custom-made drawing table.

Thanks to the machine learning study group meetup of Barcelona (grup d’estudis de machine learning de Barcelona) for hosting the official presentation of the book, and to Oriol Pujol for agreeing to give the talk that accompanied that presentation at the meetup.

Also, many thanks to the organizations that helped me spread the word about this work: the Facultad de Informática de Barcelona (FIB), the technology project accelerator ITNIG, the Col·legi Oficial d’Enginyers Informàtics (COEINF), the Associació d’Antics Alumnes de la FIB (FIBAlumni), the technology portal TECNONEWS, the iDigital portal and the Centre d’Excel·lència en Big Data in Barcelona (Big Data CoE de Barcelona).

To finish, a special mention for the “penya cap als 50”, the “colla dels informàtics” with whom, after 30 years, we still hold gatherings that leave you fully charged with energy.

As it happens, that November weekend when the people at Google in Silicon Valley decided to release TensorFlow, I spent with this group.

If I had not recharged my batteries with them that weekend, I can assure you that the next day, when I considered throwing myself into writing this book, I would not have had the energy I needed.

Jordi Torres is a professor at UPC Barcelona Tech and a research manager and senior advisor at the Barcelona Supercomputing Center, with a wide range of research and teaching activities spanning more than 25 years. With a strong background as a Computer Engineer, his explorer and entrepreneurial spirit has led him to become a Big-Data engineer able to engage with Data Scientists. Currently, his research focus is gradually moving from supercomputing architectures and runtimes to execution middleware for big data workloads, and more recently to platforms for Machine Learning on massive data.

More information at http://www.bsc.es

The Universitat Politècnica de Catalunya · BarcelonaTech (UPC) is a public institution dedicated to higher education and research, specialised in the fields of engineering, architecture and science.

As with many other machine learning meetups, regular meetings are organized with a two-fold objective: to learn about machine learning (from experiences and applications to algorithms, models and theory) and to meet people with similar interests in order to build a wide and supportive community.

Jordi Girona 1-3, 08034 Barcelona
Cover design: Jordi Torres
Illustrations: Jordi Torres
Orthographic and typographic proofreader: Laura Juan Merino
Editor: Ferran Julià Massó
Publisher: Jordi Torres, BSC-CNS
Citation: First contact with TensorFlow, get started with Deep Learning programming. Jordi Torres, Ed.
