AI News, Microsoft is taking autocorrect to the next level
The standard version didn’t pick up on three missing determiners while the prototype Windows ML-powered version highlighted the three nouns that were missing their determiners. “We’ve trained the grammar checker and it now can suggest corrections that I can take action on and fix,”
“We’re running this on Windows ML, which enables Word to build an experience that is low-latency, has high scalability because there are a lot of Word users out there, and it can work offline.” The big news here is that Microsoft’s products, such as Word, are now relying on machine learning algorithms running locally on a Windows 10 device, and not in the cloud.
And because these algorithms are running locally within apps installed on a device, the results are extremely quick. Group program manager Kam VedBrat introduced the new Windows ML application programming interface (API) in March, a platform that enables developers to implement pre-trained machine learning models in their apps and experiences.
Microsoft working on machine-learning grammar features for Word
In its day two keynote for Build 2018, Microsoft showed off a brief demonstration of how it is planning to bring the power of machine learning to Word.
On stage, Kevin Gallo, head of the Windows developer platform, displayed an older version of Word compared to the newer version with a machine learning-powered grammar checker built in.
The demo was relatively short and there's no word on when to expect the Windows ML features in Word, but it's an interesting, albeit small, look at how Microsoft is using machine learning in its own apps.
Applying NLP in Sentiment Classification and Entity Recognition Using Azure ML and the Team Data Science Process
We recently published two real-world scenarios demonstrating how to use Azure Machine Learning alongside the Team Data Science Process (TDSP) to execute AI projects involving Natural Language Processing (NLP) use-cases, namely, for sentiment classification and entity extraction.
The samples use a variety of Azure data platforms, such as Data Science Virtual Machines (DSVMs) to train DNN models for sentiment classification and entity extraction using GPUs, and HDInsight Spark for data processing and word embedding model training at scale.
The samples show how domain-specific word embeddings, generated using domain-specific and labeled training data sets, outperform generic word embeddings trained on general and unlabeled data, which leads to improved accuracy in classification and entity extraction tasks.
The data used in this project is the Sentiment140 dataset, which contains the text of the tweets (with emoticons removed) along with the polarity of each tweet (positive or negative; neutral tweets are removed for this project).
Skip-gram is a shallow neural network taking the target word encoded as a one hot vector as input and using it to predict nearby words.
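To make the skip-gram setup concrete, here is a minimal sketch of how training pairs could be generated. The helper names, the toy sentence, and the window size are illustrative assumptions, not part of the published samples.

```python
def one_hot(index, size):
    """Return a list of zeros with a single 1 at `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec


def skipgram_pairs(tokens, window=2):
    """Return (target, context) word pairs within `window` positions."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs


# Toy sentence (an assumption for illustration only).
tokens = "the movie was really great".split()
vocab = {word: idx for idx, word in enumerate(sorted(set(tokens)))}
pairs = skipgram_pairs(tokens, window=1)

# Each training example: one-hot target as input, context word index as label.
examples = [(one_hot(vocab[t], len(vocab)), vocab[c]) for t, c in pairs]
```

A real skip-gram model would feed these examples to a shallow network and keep the learned hidden-layer weights as the word vectors.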
2014 tries to overcome the weakness of the Word2vec algorithm whereby words with similar contexts and opposite polarity can have similar word vectors.
We are using a simplified variant of SSWE here implemented as a convolutional neural network (CNN) designed to optimize the cross-entropy of sentiment classes as the loss function.
Our use case scenario focuses on how a large unstructured corpus, such as Medline PubMed abstracts, can be analyzed to train a word embedding model.
Our results show that the biomedical entity extraction model trained on the domain-specific word embedding features outperforms the model trained on the generic feature type (using Google News data).
We also showed that customization of the word embedding approach or the use of domain-specific data-sets for word embeddings can improve the accuracy of subsequent tasks such as classification or entity extraction.
Stock Market Predictions with Natural Language Deep Learning
Overview
We recently worked with a financial services partner to develop a model to predict the future stock market performance of public companies in categories where they invest.
Our results demonstrate how a deep learning model trained on text in earnings releases and other sources could provide a valuable signal to an investment decision maker.
The Challenge
When reviewing investment decisions, a firm needs to utilize all possible information, starting with publicly available documents like 10-K reports.
Our challenge was to build a predictive model that could do a preliminary review of these documents more consistently and economically, allowing investment analysts to focus their follow-up analysis time more efficiently and resulting in better investment decisions.
For this project, we sought to prototype a predictive model to render consistent judgments on a company’s future prospects, based on the written textual sections of public earnings releases extracted from 10-K releases and actual stock market performance.
While there are broader potential applications of processing public earnings release narratives to predict future stock value, for the purposes of this project we focused just on generating predictions that could better inform further human analysis by our partner.
Our Approach
Tooling, Pre-Processing and Initial NLP Exploration
We began our work in Python with Azure Machine Learning Workbench, exploring our data with the aid of the integrated Jupyter Notebook.
business overview acad overview we are biopharmaceutical company focused discovery development commercialization small molecule drugs treatment central nervous system disorders we currently have six clinical programs several additional programs discovery development our most advanced program we are conducting phase iii studies pimavanserin treatment parkinson disease psychosis we have reported positive results phase ii trial our program pimavanserin co therapy schizophrenia we also have completed enrollment phase iib trial our program acp stand alone treatment schizophrenia addition we have completed proof concept clinical study pimavanserin treatment sleep maintenance insomnia healthy older adults we have retained worldwide
An important consideration in our approach was our limited data sample of fewer than 35,000 individual text document samples across industries, with much smaller sample sizes within an industry.
We used the GloVe pre-trained model of all of Wikipedia’s 2014 data, a six billion token, 400,000-word vocabulary vector model, chosen for its broad domain coverage and less colloquial nature. This pre-trained set of word vectors allowed us to vectorize our document set and prepare it for deep learning toolkits.
Word Vector Example
Although this pre-trained model has a vast 400,000-word vocabulary, it still has limitations as it relates to our text corpus.
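The vocabulary limitation can be illustrated with a small sketch of loading GloVe-style vectors and looking words up. It assumes the plain-text GloVe layout (a word followed by its vector values on each line). The three-dimensional toy vectors and the function names are hypothetical stand-ins, not the real 300-dimensional data.

```python
import io


def load_glove(lines):
    """Parse GloVe text format: one word followed by its vector per line."""
    index = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        index[parts[0]] = [float(v) for v in parts[1:]]
    return index


def vectorize(tokens, index, dim):
    """Map tokens to vectors; out-of-vocabulary words become all zeros."""
    return [index.get(tok, [0.0] * dim) for tok in tokens]


# Toy stand-in for a GloVe file (values are made up for illustration).
sample = io.StringIO("drug 0.1 0.2 0.3\ntrial 0.4 0.5 0.6\n")
embeddings_index = load_glove(sample)

# "pimavanserin" is absent from the toy index, so it maps to zeros.
vectors = vectorize(["drug", "pimavanserin", "trial"], embeddings_index, dim=3)
```

Any domain term missing from the pre-trained vocabulary carries no signal into the model, which is one reason a domain-specific vocabulary can matter.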
We modeled our prototype on just one industry, the biotechnology industry, which had the most abundant within-industry sample. Our project goal was to discern whether we could outperform chance accuracy of 33.33%.
Convolutional Neural Net Model with Keras
With our documents represented by a series of embeddings, we were able to take advantage of a convolutional neural network (CNN) model to learn the classifications.
    import numpy as np
    from keras.layers import Embedding

    ##################################
    # Build the embedding matrix
    ##################################
    # word_index and embeddings_index are assumed to be built in earlier steps (not shown).
    print('Building Embedding Matrix...')
    embedding_matrix = np.zeros((len(word_index) + 1, EMBEDDING_DIM))
    for word, i in word_index.items():
        embedding_vector = embeddings_index.get(word)
        if embedding_vector is not None:
            # words not found in embedding index will be all-zeros.
            embedding_matrix[i] = embedding_vector

    embedding_layer = Embedding(len(word_index) + 1,
                                EMBEDDING_DIM,
                                weights=[embedding_matrix],
                                input_length=MAX_SEQUENCE_LENGTH,
                                trainable=False)

For the model itself, we employed the ADAM optimizer and the Lecun initializer, and we used the exponential linear unit (‘elu’) activation function.
    from keras.layers import Input, Conv1D, MaxPooling1D, Dropout, Flatten, Dense
    from keras.models import Model

    #########################
    # Select Model Parameters
    #########################
    MAX_SEQUENCE_LENGTH = 10000  # Max sequence of 10k words from each sample
    MAX_NB_WORDS = 400000  # Using the full Glove Vocabulary
    EMBEDDING_DIM = 300  # Each word in the sequence represented by 300 values
    VALIDATION_SPLIT = 0.2  # Train/Test Split
    LEARNING_RATE = .00011
    BATCH_SIZE = 33
    DROPOUT_RATE = 0.45  # Dropout applied to last layer
    INNERLAYER_DROPOUT_RATE = 0.15  # Dropout applied to inner layers

    ###############
    # 1D CNN DESIGN
    ###############
    sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
    embedded_sequences = embedding_layer(sequence_input)
    x = Conv1D(128, 5, activation='elu', kernel_initializer='lecun_uniform')(embedded_sequences)
    x = MaxPooling1D(5)(x)
    x = Dropout(INNERLAYER_DROPOUT_RATE)(x)
    x = Conv1D(128, 5, activation='elu', kernel_initializer='lecun_uniform')(x)
    x = MaxPooling1D(5)(x)
    x = Dropout(INNERLAYER_DROPOUT_RATE)(x)
    x = Conv1D(128, 5, activation='elu', kernel_initializer='lecun_uniform')(x)
    x = MaxPooling1D(35)(x)  # global max pooling
    x = Flatten()(x)
    x = Dense(100, activation='elu', kernel_initializer='lecun_uniform')(x)  # best initializers: glorot_normal, VarianceScaling, lecun_uniform
    x = Dropout(DROPOUT_RATE)(x)
    preds = Dense(len(labels_index), activation='softmax')(x)  # no initialization in output layer
    model = Model(sequence_input, preds)

The Result
Our prototype model results, while modest, suggest there is a useful signal available on future performance classification in at least the biotechnology industry based on the target text from the 10-K.
Classes                          0          1          2
Population                     183        183        183
P: Condition positive           61         62         60
N: Condition negative          122        121        123
Test outcome positive           84         39         60
Test outcome negative           99        144        123
TP: True Positive               38         29         25
TN: True Negative               76        111         88
FP: False Positive              46         10         35
FN: False Negative              23         33         35
FPR: Fall-out             0.377049  0.0826446   0.284553
FNR: Miss Rate            0.377049   0.532258   0.583333
ACC: Accuracy             0.622951   0.765027   0.617486
F1 score                  0.524138   0.574257   0.416667

Overall Statistics:
Accuracy: 0.502732240437
95% CI: (0.42803368164557776, 0.57734086516071881)
P-Value [Acc >
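As a cross-check, the per-class rates and the overall accuracy can be recomputed from the confusion counts alone. The sketch below uses the standard one-vs-rest definitions. The function and variable names are illustrative, and the counts are copied from the figures reported above.

```python
def class_metrics(tp, tn, fp, fn):
    """Derive per-class rates from one-vs-rest confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "fall_out": fp / (fp + tn),    # false positive rate
        "miss_rate": fn / (fn + tp),   # false negative rate
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Counts taken from the reported statistics for classes 0, 1 and 2.
counts = {
    0: dict(tp=38, tn=76, fp=46, fn=23),
    1: dict(tp=29, tn=111, fp=10, fn=33),
    2: dict(tp=25, tn=88, fp=35, fn=35),
}
per_class = {label: class_metrics(**c) for label, c in counts.items()}

# Overall accuracy is the share of samples whose class was predicted correctly.
total = sum(counts[0].values())
overall_accuracy = sum(c["tp"] for c in counts.values()) / total
```

Note the difference between the two accuracy figures: the per-class value treats each class as a separate yes/no question, while the overall value counts only exact three-way matches.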
In this model, we are seeing 62% accuracy for predicting the under-performing company based on the sample 10-K text.
The history of model training and testing is below, trained for 24 epochs.
Next Steps
This initial result suggests that deep learning models trained on text in earnings releases and other sources could prove a viable mechanism to improve the quality of the information available to those making investment decisions, particularly in avoiding investment losses.
While the model needs to be improved with more samples, refinements of domain-specific vocabulary, and text augmentation, it suggests that providing this signal as another decision input for investment analysts would improve the efficiency of the firm’s analysis work.
Our partner will look to improve the model with more samples and to augment them with additional information taken from the earnings releases and additional publications and a larger sample of companies.
In addition, they will look to replicate this model for different industries and operationalize the model with Azure Machine Learning Workbench, allowing auto-scaling and custom model management for many clients.
Overall, this prototype validated additional investment by our partner in natural language based deep learning to improve efficiency, consistency, and effectiveness of human reviews of textual reports and information.
Please feel free to reach out in the comments below or directly via Twitter @SingingData.
Create text analytics models in Azure Machine Learning Studio
In a text analytics experiment, you would typically: In this tutorial, you learn these steps as we walk through a sentiment analysis model using the Amazon Book Reviews dataset (see this research paper “Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification” by John Blitzer, Mark Dredze, and Fernando Pereira;
You can find the experiments covered in this tutorial at Azure AI Gallery:
Predict Book Reviews
Predict Book Reviews - Predictive Experiment
We begin the experiment by dividing the review scores into categorical low and high buckets to formulate the problem as two-class classification.
The cleaning reduces the noise in the dataset, helps you find the most important features, and improves the accuracy of the final model.
You can also use custom C# syntax regular expression to replace substrings, and remove words by part of speech: nouns, verbs, or adjectives.
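The kind of cleaning described here can be approximated outside the Studio module. This is a sketch in Python using the standard `re` module instead of C# regular expressions. The stop-word list, the patterns, and the sample review are assumptions chosen for illustration, not the module's actual defaults.

```python
import re

# Tiny illustrative stop-word list (an assumption, not the module's list).
STOPWORDS = {"a", "an", "the", "is", "was", "and", "of"}


def clean_text(text):
    """Lowercase, strip markup and non-letters, then drop stop words."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)   # drop HTML-like tags
    text = re.sub(r"[^a-z\s]", " ", text)  # drop digits and punctuation
    words = [w for w in text.split() if w not in STOPWORDS]
    return " ".join(words)


cleaned = clean_text("The plot was GREAT!!! <br/> 10/10, a must-read.")
```

Removing words by part of speech would need a tagger and is left out of this sketch.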
In this tutorial, we set N-gram size to 2, so our feature vectors include single words and combinations of two subsequent words.
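With an N-gram size of 2, each review contributes both its single words and its adjacent word pairs. A minimal sketch of that extraction follows; the function name and sample phrase are illustrative.

```python
def extract_ngrams(tokens, max_n=2):
    """Return all n-grams of length 1..max_n as space-joined strings."""
    features = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            features.append(" ".join(tokens[i:i + n]))
    return features


features = extract_ngrams("not a good book".split(), max_n=2)
```

The bigram "not a" is an example of context that single words alone would lose.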
This approach adds weight to words that appear frequently in a single record but are rare across the entire dataset.
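This weighting can be sketched with the textbook TF-IDF formula: the term count multiplied by the log of the inverse document frequency. The exact formula the Studio module uses may differ, so treat this as an assumption. The toy documents are made up.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = [
    "great great story".split(),
    "boring story".split(),
    "great characters".split(),
]
weights = tf_idf(docs)
```

In the second document, "boring" appears in only one of three records and so outweighs "story", which appears in two.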
Therefore, we add the Extract N-Gram Features module to the scoring branch of the experiment, connect the output vocabulary from the training branch, and set the vocabulary mode to read-only.
We also disable the filtering of N-grams by frequency by setting the minimum to 1 instance and maximum to 100%, and turn off the feature selection.
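The effect of a read-only vocabulary can be shown with a small sketch: the vocabulary is learned once from training text, and scoring text is mapped onto it without adding new entries. The helper names and toy data are hypothetical.

```python
def build_vocabulary(token_lists):
    """Assign a column index to each distinct token seen in training."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def transform(tokens, vocab):
    """Count features using a fixed vocabulary; unseen tokens are ignored."""
    row = [0] * len(vocab)
    for tok in tokens:
        idx = vocab.get(tok)
        if idx is not None:
            row[idx] += 1
    return row


train = [["good", "book"], ["bad", "book"]]
vocab = build_vocabulary(train)  # learned once, in the training branch

# "ending" was never seen in training, so it produces no feature.
scored = transform(["good", "good", "ending"], vocab)
```

Keeping the vocabulary fixed guarantees that scoring data has the same feature columns, in the same order, as the data the model was trained on.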
After the text column in the test data has been transformed to numeric feature columns, we exclude the string columns from previous stages, as in the training branch.
It uses the learned N-gram vocabulary to transform the text to features, and the trained logistic regression model to make a prediction from those features.
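The prediction step itself is a weighted sum passed through a sigmoid. In the sketch below, the coefficients, bias, and the 0.5 decision threshold are hypothetical values chosen for illustration, not values from the trained experiment.

```python
import math


def predict_proba(features, weights, bias):
    """Logistic regression: sigmoid of the weighted feature sum."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


weights = [1.2, -0.3, -1.5]  # hypothetical coefficients, one per n-gram feature
bias = 0.1                   # hypothetical intercept

# Feature row for a review, e.g. counts over a three-term vocabulary.
p_high = predict_proba([2, 0, 0], weights, bias)
label = "high" if p_high >= 0.5 else "low"
```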
To set up the predictive experiment, we first save the N-gram vocabulary as a dataset, along with the trained logistic regression model from the training branch of the experiment.
That way, the web service does not request the label it is trying to predict, and does not echo the input features in the response.
- On Saturday, March 23, 2019
OCR, Deep Learning & Algorithms: Building Tanmay's Word Search using Tesseract and OCR.Space!
I hope you enjoyed this tutorial! If you did, please make sure to leave a like, comment, and subscribe! It really does help out a lot! Links: tWordSearch Swift Script: ...
Natural Language Processing With Python and NLTK p.1 Tokenizing words and Sentences
Natural Language Processing is the task we give computers to read and understand (process) written text (natural language). By far, the most popular toolkit or ...
Hello World - Machine Learning Recipes #1
Six lines of Python is all it takes to write your first machine learning program! In this episode, we'll briefly introduce what machine learning is and why it's ...
TensorFlow Tutorial | Deep Learning Using TensorFlow | TensorFlow Tutorial Python | Edureka
This Edureka TensorFlow Tutorial video will help ...
Sentiment Analysis in 4 Minutes
Link to the full Kaggle tutorial w/ code: Sentiment Analysis in 5 lines of ..
Handwriting Recognition with Python
In this tutorial, we will use Python's machine learning library, scikit-learn, to predict human handwriting. Ipython Notebook: ...
Python Machine Learning Tutorial | Machine Learning Algorithms | Python Training | Edureka
This Edureka Python tutorial gives an introduction to Machine ...
Machine Learning Lecture 2: Sentiment Analysis (text classification)
In this video we work on an actual sentiment analysis dataset (which is an instance of text classification), for which I also provide Python code (see below).
How Machines Learn
How do all the algorithms around us learn to do their jobs? Bot Wallpapers on Patreon: Discuss this video: ..
How to build machine learning applications using R and Python in SQL Server 2017 - BRK3298
Learn how to use R and Python integration in Microsoft SQL Server 2016/2017 to build machine learning applications. Learn how to operationalize machine ...