Data science: Identifying influencers in social networks

Srikanth Bethu, V. Sowmya, B. Sankara Babu, G. Charles Babu, Y. Jeevan Nagendra Kumar


Data science is a "concept to unify statistics, data analysis and their related methods" in order to "understand and analyze actual phenomena" with data. The widespread use of Online Social Networks (OSN)[2] for communication, which enable real-time capture and sharing of multimedia, has led to enormous amounts of user-generated content being placed online and made publicly available for analysis and mining. Efforts have also been made to raise privacy awareness and protect personal data against privacy threats. A central idea in designing marketing strategies is to identify the influencers in the network. Influential individuals induce “word-of-mouth” effects in the network: their actions convince their peers (followers) to perform a similar action, such as buying a product. Targeting these influencers usually leads to a wide spread of information across the network. It is therefore important to identify such individuals in a network. We use centrality measures to assign an influence score to each user; a user with a higher score is considered a better influencer.
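As an illustration of scoring users with a centrality measure, the following minimal sketch ranks users by degree centrality (the fraction of other users a person is directly connected to). The abstract does not say which centrality measures are used, so degree centrality is an assumption here, chosen because it is the simplest. The function names, the undirected edge-list input, and the toy network are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {user: degree / (n - 1)} for an undirected edge list.

    Assumption: influence is approximated by degree centrality; other
    measures (betweenness, closeness, eigenvector) could be substituted.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {user: 0.0 for user in neighbors}
    # Normalize by the maximum possible number of neighbors, n - 1.
    return {user: len(adj) / (n - 1) for user, adj in neighbors.items()}


def rank_influencers(edges, top_k=3):
    """Return the top_k (user, score) pairs, highest score first.

    Ties are broken alphabetically by user name so the order is stable.
    """
    scores = degree_centrality(edges)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Hypothetical toy network: each pair is a mutual connection.
edges = [
    ("ana", "ben"),
    ("ana", "cy"),
    ("ana", "dee"),
    ("ben", "cy"),
    ("dee", "eli"),
]

ranking = rank_influencers(edges)
for user, score in ranking:
    print(f"{user}: {score:.2f}")
```

In this toy network the user with the most direct connections receives the highest score and would be treated as the strongest influencer. On a directed follower graph, in-degree (the number of followers) would be the natural analogue.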


“Python for Informatics: Exploring Information” – Book by Charles Severance

“Practical Data Science Cookbook” – Book by Abhijit Dasgupta, Benjamin Bengfort, Sean Patrick Murphy, and Tony Ojeda.

Stanford WebBase Project. http://www-diglib.

L. A. Adamic. The Small World Web. In Proceedings of the Third European Conference on Research and Advanced Technology for Digital Libraries (ECDL’99), Paris, France, Sep 1999.

L. A. Adamic, O. Buyukkokten, and E. Adar. A social network caught in the Web. First Monday, 8(6), 2003.

Y.-Y. Ahn, S. Han, H. Kwak, S. Moon, and H. Jeong. Analysis of Topological Characteristics of Huge Online Social Networking Services. In Proceedings of the 16th international conference on World Wide Web (WWW’07), Banff, Canada, May 2007.

R. Albert, H. Jeong, and A.-L. Barabási. The Diameter of the World Wide Web. Nature, 401:130, 1999.

L. A. N. Amaral, A. Scala, M. Barthélémy, and H. E. Stanley. Classes of small-world networks. Proceedings of the National Academy of Sciences (PNAS), 97:11149–11152, 2000.

A. Awan, R. A. Ferreira, S. Jagannathan, and A. Grama. Distributed uniform sampling in real-world networks. Technical Report CSD-TR-04-029, Purdue University, 2004.

L. Backstrom, D. Huttenlocher, J. Kleinberg, and X. Lan. Group Formation in Large Social Networks: Membership, Growth, and Evolution. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’06), Philadelphia, PA, Aug 2006.

A.-L. Barabási and R. Albert. Emergence of Scaling in Random Networks. Science, 286:509–512, 1999.

L. Becchetti, C. Castillo, D. Donato, and A. Fazzone. A Comparison of Sampling Techniques for Web Graph Characterization. In Proceedings of the Workshop on Link Analysis (LinkKDD’06), Philadelphia, PA, Aug 2006.

V. Braitenberg and A. Schüz. Anatomy of a Cortex: Statistics and Geometry. Springer-Verlag, Berlin, 1991.

A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata, A. Tomkins, and J. Wiener. Graph Structure in the Web: Experiments and Models. In Proceedings of the 9th International World Wide Web Conference (WWW’00), Amsterdam, May 2000.

A. Clauset, C. R. Shalizi, and M. E. J. Newman. Power-law distributions in empirical data, Jun 2007.

d. boyd. Friends, Friendsters, and Top 8: Writing community into being on social network sites. First Monday, 11(12), 2006.

P. Erdős and A. Rényi. On Random Graphs I. Publicationes Mathematicae Debrecen, 5:290–297, 1959.

M. Faloutsos, P. Faloutsos, and C. Faloutsos. On Power-Law Relationships of the Internet Topology. In Proceedings of the Annual Conference of the ACM Special Interest Group on Data Communication (SIGCOMM’99), Cambridge, MA, Aug 1999.

S. Garriss, M. Kaminsky, M. J. Freedman, B. Karp, D. Mazières, and H. Yu. Re: Reliable Email. In Proceedings of the 3rd Symposium on Networked Systems Design and Implementation (NSDI’06), San Jose, CA, May 2006.

M. Girvan and M. E. J. Newman. Community structure in social and biological networks. Proceedings of the National Academy of Sciences (PNAS), 99:7821–7826, 2002.

Google Co-op.

M. Granovetter. The Strength of Weak Ties. American Journal of Sociology, 78(6), 1973.

J. Kleinberg. Authoritative Sources in a Hyperlinked Environment. Journal of the ACM, 46:604–632, 1999.

J. Kleinberg. Navigation in a Small World. Nature, 406:845–845, 2000.

J. Kleinberg. The Small-World Phenomenon: An Algorithmic Perspective. In Proceedings of the 32nd ACM Symposium on Theory of Computing (STOC’00), Portland, OR, May 2000.

J. Kleinberg and S. Lawrence. The Structure of the Web. Science, 294:1849–1850, 2001.

J. M. Kleinberg and R. Rubinfeld. Short paths in expander graphs. In IEEE Symposium on Foundations of Computer Science (FOCS’96), Burlington, VT, Oct 1996.

R. Kumar, J. Novak, and A. Tomkins. Structure and Evolution of Online Social Networks. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’06), Philadelphia, PA, Aug 2006.

R. Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins. Trawling the Web for Emerging Cyber-Communities. Computer Networks, 31:1481–1493, 1999.

S. Lee, R. Sherwood, and B. Bhattacharjee. Cooperative peer groups in NICE. In Proceedings of the Conference on Computer Communications (INFOCOM’03), San Francisco, CA, Mar 2003.

S. H. Lee, P.-J. Kim, and H. Jeong. Statistical properties of sampled networks. Physical Review E, 73, 2006.

L. Li and D. Alderson. Diversity of graphs with highly variable connectivity. Physical Review E, 75, 2007.

L. Li, D. Alderson, J. C. Doyle, and W. Willinger. Towards a Theory of Scale-Free Graphs: Definitions, Properties, and Implications. Internet Mathematics, 2(4):431–523, 2006.

D. Liben-Nowell, J. Novak, R. Kumar, P. Raghavan, and A. Tomkins. Geographic Routing in Social Networks. Proceedings of the National Academy of Sciences (PNAS), 102(33):11623–11628, 2005.

P. Mahadevan, D. Krioukov, K. Fall, and A. Vahdat. Systematic Topology Analysis and Generation Using Degree Correlations. In Proceedings of the Annual Conference of the ACM Special Interest Group on Data Communication (SIGCOMM’06), Pisa, Italy, August 2006.

S. Milgram. The small world problem. Psychology Today, 2(60), 1967.

A. Mislove, K. P. Gummadi, and P. Druschel. Exploiting social networks for Internet search. In Proceedings of the 5th Workshop on Hot Topics in Networks (HotNets-V), Irvine, CA, Nov 2006.

M. Molloy and B. Reed. A critical point for random graphs with a given degree distribution. Random Structures and Algorithms, 6, 1995.

M. Molloy and B. Reed. The size of the giant component of a random graph with a given degree sequence. Combinatorics, Probability and Computing, 7, 1998.

R. Morselli, B. Bhattacharjee, J. Katz, and M. A. Marsh. Keychains: A Decentralized Public-Key Infrastructure. Technical Report CS-TR-4788, University of Maryland, 2006.


MySpace is the number one website in the U.S. according to Hitwise. Hitwise Press Release, July 11, 2006. hitwiseHS2004/social-networking-june-2006.php.

M. E. J. Newman. The structure of scientific collaboration networks. Proceedings of the National Academy of Sciences (PNAS), 98:409–415, 2001.

M. E. J. Newman. Mixing patterns in networks. Physical Review E, 67, 2003.




Copyright (c) 2019 Srikanth S

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

ISSN: 2303-4521

Digital Object Identifier DOI: 10.21533/pen
