AI News, Evolving Graphs and Similarity artificial intelligence
Senior Research Scientist, Adobe Research; Member of Research Staff, Palo Alto Research Center; Visiting Researcher, Palo Alto Research Center (Xerox PARC); Research Fellow, Purdue University (2009-2012); Research Assistant, Lawrence Livermore National Laboratory
Spring 2008. This course was taught from a machine learning perspective, using a variety of resources and recent papers along with a series of homeworks and projects implementing the significant parts of a search engine.
As an assistant, I gave lectures and review sessions, developed homeworks, labs, and programs, held office hours, and maintained the course website.
A recommender system or a recommendation system (sometimes replacing 'system' with a synonym such as platform or engine) is a subclass of information filtering system that seeks to predict the 'rating' or 'preference' a user would give to an item.
Recommender systems are utilized in a variety of areas, and are most commonly recognized as playlist generators for video and music services like Netflix, YouTube and Spotify, product recommenders for services such as Amazon, or content recommenders for social media platforms such as Facebook and Twitter.
These systems can operate using a single input, like music, or multiple inputs within and across platforms like news, books, and search queries.
Collaborative filtering approaches build a model from a user's past behavior (items previously purchased or selected and/or numerical ratings given to those items) as well as similar decisions made by other users.
The user- and item-based nearest neighbor algorithms can be combined to deal with the cold start problem and improve recommendation results using this data.
A key advantage of the collaborative filtering approach is that it does not rely on machine-analyzable content, and it is therefore capable of accurately recommending complex items such as movies without requiring an 'understanding' of the item itself.
One of the most famous examples of collaborative filtering is item-to-item collaborative filtering (people who buy x also buy y), an algorithm popularized by Amazon.com's recommender system.
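The "people who buy x also buy y" idea can be sketched with co-purchase counts. This is a minimal illustration, not Amazon's actual algorithm: the basket data, the function names `item_similarities` and `also_bought`, and the choice of cosine similarity over co-occurrence counts are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def item_similarities(baskets):
    """Cosine similarity between items, computed from co-purchase counts."""
    item_count = defaultdict(int)  # baskets containing each item
    pair_count = defaultdict(int)  # baskets containing both items of a pair
    for basket in baskets:
        items = sorted(set(basket))
        for item in items:
            item_count[item] += 1
        for a, b in combinations(items, 2):
            pair_count[(a, b)] += 1

    sims = defaultdict(dict)
    for (a, b), n_ab in pair_count.items():
        score = n_ab / sqrt(item_count[a] * item_count[b])
        sims[a][b] = score
        sims[b][a] = score
    return sims


def also_bought(sims, item, k=3):
    """Return up to k items most similar to `item`, best first."""
    ranked = sorted(sims.get(item, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]


# Hypothetical purchase history: one list per customer basket.
baskets = [
    ["x", "y", "z"],
    ["x", "y"],
    ["x", "w"],
    ["y", "z"],
]
sims = item_similarities(baskets)
print(also_bought(sims, "x"))
```

Note that the similarity table depends only on items, so it can be computed offline and looked up at recommendation time.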
Many social networks originally used collaborative filtering to recommend new friends, groups, and other social connections by examining the network of connections between a user and their friends.
Simple approaches use the average values of the rated item vector, while more sophisticated methods use machine learning techniques such as Bayesian classifiers, cluster analysis, decision trees, and artificial neural networks to estimate the probability that the user is going to like the item.
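The simple averaging approach can be sketched as follows. This is an illustrative example under stated assumptions: the three-dimensional feature vectors, the film names, and the use of cosine similarity to score candidates are hypothetical choices, not something the text prescribes.

```python
from math import sqrt


def average_profile(item_vectors, liked_items):
    """User profile as the component-wise mean of the liked items' vectors."""
    dims = len(next(iter(item_vectors.values())))
    totals = [0.0] * dims
    for item in liked_items:
        for i, value in enumerate(item_vectors[item]):
            totals[i] += value
    return [total / len(liked_items) for total in totals]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical item features: [action, comedy, romance]
item_vectors = {
    "film_a": [1.0, 0.0, 0.0],
    "film_b": [1.0, 1.0, 0.0],
    "film_c": [0.0, 0.0, 1.0],
    "film_d": [1.0, 0.0, 1.0],
}

profile = average_profile(item_vectors, ["film_a", "film_b"])
candidates = ["film_c", "film_d"]
ranked = sorted(
    candidates,
    key=lambda name: cosine(profile, item_vectors[name]),
    reverse=True,
)
print(profile, ranked)
```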
A key issue with content-based filtering is whether the system is able to learn user preferences from users' actions regarding one content source and use them across other content types.
When the system is limited to recommending content of the same type as the user is already using, the value from the recommendation system is significantly less than when other content types from other services can be recommended.
Popular approaches to opinion-based recommender systems use various techniques, including text mining, information retrieval, sentiment analysis (see also Multimodal sentiment analysis) and deep learning.
Instead of developing recommendation techniques based on a single criterion value, the overall preference of user u for the item i, these systems try to predict a rating for unexplored items of u by exploiting preference information on multiple criteria that affect this overall preference value.
The majority of existing approaches to recommender systems focus on recommending the most relevant content to users using contextual information, yet do not take into account the risk of disturbing the user with unwanted notifications.
It is important to consider the risk of upsetting the user by pushing recommendations in certain circumstances, for instance, during a professional meeting, early morning, or late at night.
Additionally, mobile recommender systems suffer from a transplantation problem – recommendations may not apply in all regions (for instance, it would be unwise to recommend a recipe in an area where all of the ingredients may not be available).
This system uses GPS data of the routes that taxi drivers take while working, which includes location (latitude and longitude), time stamps, and operational status (with or without passengers).
These methods can also be used to overcome some of the common problems in recommender systems such as cold start and the sparsity problem, as well as the knowledge engineering bottleneck in knowledge-based approaches.
The website makes recommendations by comparing the watching and searching habits of similar users (i.e., collaborative filtering) as well as by offering movies that share characteristics with films that a user has rated highly (content-based filtering).
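One simple way to combine the two signals is a weighted blend of their scores. The sketch below is only an illustration of that general idea and makes no claim about how the website actually merges them; the function names, the `weight` parameter, and the score values are assumptions.

```python
def hybrid_scores(cf_scores, content_scores, weight=0.5):
    """Blend two score dictionaries; a missing score counts as 0.0."""
    items = set(cf_scores) | set(content_scores)
    return {
        item: weight * cf_scores.get(item, 0.0)
        + (1.0 - weight) * content_scores.get(item, 0.0)
        for item in items
    }


def rank(scores):
    """Items ordered by descending score, ties broken by name."""
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical scores from a collaborative model and a content-based model.
cf_scores = {"film_a": 0.9, "film_b": 0.2}
content_scores = {"film_b": 0.8, "film_c": 0.6}

ranking = rank(hybrid_scores(cf_scores, content_scores))
print(ranking)
```

Setting `weight` closer to 1.0 favors the collaborative signal; closer to 0.0 favors the content-based one.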
From 2006 to 2009, Netflix sponsored a competition, offering a grand prize of $1,000,000 to the team that could take an offered dataset of over 100 million movie ratings and return recommendations that were 10% more accurate than those offered by the company's existing recommender system.
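A "10% more accurate" target can be made concrete by assuming accuracy is measured as root-mean-square error (RMSE) between predicted and actual ratings, with improvement expressed relative to a baseline. The ratings below are invented for illustration and the helper names are hypothetical.

```python
from math import sqrt


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length rating lists."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sqrt(sum(squared) / len(actual))


def relative_improvement(baseline_rmse, new_rmse):
    """Fraction by which the new error is lower than the baseline error."""
    return (baseline_rmse - new_rmse) / baseline_rmse


# Hypothetical held-out ratings and two sets of predictions.
actual = [4, 3, 5, 2]
baseline_pred = [3, 3, 4, 3]
new_pred = [4, 3, 4, 2]

gain = relative_improvement(rmse(baseline_pred, actual), rmse(new_pred, actual))
print(f"improvement: {gain:.1%}, meets 10% target: {gain >= 0.10}")
```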
Although the data sets were anonymized in order to preserve customer privacy, in 2007 two researchers from the University of Texas were able to identify individual users by matching the data sets with film ratings on the Internet Movie Database.
To measure the effectiveness of recommender systems, and compare different approaches, three types of evaluations are available: user studies, online evaluations (A/B tests), and offline evaluations.
In A/B tests, recommendations are typically shown to thousands of users of a real product, and the recommender system randomly picks among at least two different recommendation approaches to generate them.
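A minimal sketch of the A/B mechanics: each user is assigned to one approach, and a simple outcome such as click-through rate is compared per approach. Hash-based assignment is used here as an assumption so that a user always sees the same variant; the variant names and the event format are hypothetical.

```python
import hashlib


def assign_variant(user_id, variants=("A", "B")):
    """Pseudo-randomly but stably map a user to one variant."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def click_through_rates(events):
    """events: iterable of (variant, clicked) pairs -> clicks / impressions."""
    shown = {}
    clicked = {}
    for variant, did_click in events:
        shown[variant] = shown.get(variant, 0) + 1
        clicked[variant] = clicked.get(variant, 0) + (1 if did_click else 0)
    return {variant: clicked[variant] / shown[variant] for variant in shown}


# Hypothetical logged impressions.
events = [("A", True), ("A", False), ("B", True), ("B", True)]
print(assign_variant(42), click_through_rates(events))
```

In practice the difference between variants would also need a significance test before drawing conclusions; that step is omitted here.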
A systematic analysis of publications applying deep learning or neural methods to the top-k recommendation problem, published in top conferences (SIGIR, KDD, WWW, RecSys), has shown that on average less than 40% of articles are reproducible, with as little as 14% in some conferences.
Konstan and Adomavicius conclude that 'the Recommender Systems research community is facing a crisis where a significant number of papers present results that contribute little to collective knowledge […] often because the research lacks the […] evaluation to be properly judged and, hence, to provide meaningful contributions.'
Bellogín conducted a study of papers published in the field and benchmarked some of the most popular frameworks for recommendation, finding large inconsistencies in results even when the same algorithms and data sets were used.
'(1) survey other research fields and learn from them, (2) find a common understanding of reproducibility, (3) identify and understand the determinants that affect reproducibility, (4) conduct more comprehensive experiments (5) modernize publication practices, (6) foster the development and use of recommendation frameworks, and (7) establish best-practice guidelines for recommender-systems research.'
An AI accelerator is designed as hardware acceleration for artificial intelligence applications, especially artificial neural networks, machine vision and machine learning.
Notable application-specific hardware units include video cards for graphics, sound cards, graphics processing units and digital signal processors.
As deep learning and artificial intelligence workloads rose in prominence in the 2010s, specialized hardware units were developed or adapted from existing products to accelerate these tasks.
In the 1990s, there were also attempts to create parallel high-throughput systems for workstations aimed at various applications, including neural network simulations.
Heterogeneous computing refers to incorporating a number of specialized processors in a single system, or even a single chip, each optimized for a specific type of task.
have features significantly overlapping with AI accelerators including: support for packed low precision arithmetic, dataflow architecture, and prioritizing 'throughput' over latency.
The mathematical bases of neural networks and image manipulation are similar: both are embarrassingly parallel tasks involving matrices, which has led GPUs to be increasingly used for machine learning tasks.
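To see why such work parallelizes so easily, consider a single fully connected neural network layer written as a matrix-vector product. In the sketch below, each output value is an independent dot product that never reads another output, so all rows could be computed at once on parallel hardware. The weights, the inputs, and the ReLU activation are illustrative assumptions.

```python
def dense_layer(weights, inputs):
    """One layer: ReLU(W @ x), with every output row computed independently."""
    outputs = []
    for row in weights:
        # This dot product depends only on `row` and `inputs`, so the loop
        # iterations could run in any order or simultaneously.
        activation = sum(w * x for w, x in zip(row, inputs))
        outputs.append(max(0.0, activation))
    return outputs


# Hypothetical 3x2 weight matrix and 2-element input vector.
weights = [
    [1.0, -1.0],
    [0.5, 0.5],
    [-1.0, -1.0],
]
inputs = [2.0, 1.0]
print(dense_layer(weights, inputs))
```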
In June 2017, IBM researchers announced an architecture that, in contrast to the Von Neumann architecture, is based on in-memory computing and phase-change memory arrays. It was applied to temporal correlation detection, with the intention of generalizing the approach to heterogeneous computing and massively parallel systems.
In October 2018, IBM researchers announced an architecture based on in-memory processing and modeled on the human brain's synaptic network to accelerate deep neural networks.
As of 2016, the field is still in flux and vendors are pushing their own marketing term for what amounts to an 'AI accelerator', in the hope that their designs and APIs will become the dominant design.