I am an Assistant Professor at CSE, College of Computing, at the Georgia Institute of Technology. My research expertise lies in developing data-driven methods to enhance web and social media integrity. I build graph-based, content-based, and adversarial learning methods, using terabytes of data from multiple online platforms spanning multiple modalities and languages. I develop scalable and efficient methods for detection, prediction, attribution, and mitigation of the biggest online threats, namely malicious actors (e.g., ban evaders, sockpuppets, coordinated campaigns, fraudsters) and dangerous content (e.g., misinformation, hate speech, fake reviews). The models I have developed have been used at Flipkart (India's largest e-commerce platform, acquired by Walmart), have influenced Twitter's Birdwatch platform (a community-driven misinformation detection platform), and are being deployed on Wikipedia.

Prior to Georgia Tech, I was a visiting researcher at Google AI, a postdoctoral researcher at Stanford University, and a PhD student at the University of Maryland. I am honored to be recognized as a Kavli Fellow (by the National Academy of Sciences), Forbes 30 under 30, CRA Computing Innovation Mentor, Facebook Faculty Research Awardee, Adobe Faculty Research Awardee, Class of 1969 Teaching Fellow, ACM SIGKDD Doctoral Dissertation Award 2018 runner-up, WWW 2017 Best Paper Award runner-up, Larry S. Davis Doctoral Dissertation Award 2017, and Dr. B.C. Roy Gold Medal. My work has been covered in a documentary (Familiar Shapes), in a radio interview (WABE), and by the popular press, including Wired, CNN, Wall Street Journal, TechCrunch, New York Magazine, and more.

Research Interests

Online malicious actors and dangerous content threaten public health, democracy, science, and society. To combat these threats, I build technological solutions, including accurate and robust models for early identification, prediction, and attribution, as well as social mitigation solutions, such as empowering people to counter online harms. I have conducted the largest study of malicious sockpuppetry across nine platforms, studied ban evasion and recidivism on online platforms, and produced some of the earliest works on online misinformation. I am one of the first to investigate the reliability of web safety models used in practice, including Facebook's TIES and Twitter's Birdwatch. My work is one of the first to study whole-of-society solutions to mitigate online misinformation.

My research interests lie in comprehensively studying some of the biggest threats to Web Safety and Integrity from complementary angles:

  • "Multi-X" Misinformation and Malicious Actors: Multi-Platform, Multi-Modal, and Multi-Lingual
  • Enhancing the Adversarial Robustness and Trustworthiness of Web Models
  • Building Graph and Network Models for Accurate and Early Detection
  • Studying Recommender Systems' Impact and Building Responsible Recommender Systems
  • Developing Tech-Powered Social Solutions to Combat Online Harms

In detail, my research interests span the following topics:

(1) Platform Safety and Integrity: I develop methods to efficiently characterize the behavior of, and detect, both harmful content and malicious actors. Accurate characterization and early detection can greatly improve the safety, integrity, and well-being of online users, communities, and platforms. I have worked on the following types of bad behavior:

(2) Robustness of Web Models: Machine learning and deep learning models are being used for high-stakes tasks. However, their trustworthiness, reliability, and robustness to manipulation by smart adversaries and to unintentional changes in data are not known. I have explored how adversaries can manipulate recommender systems for their gains. I have conducted the first investigations to quantify the trustworthiness of Facebook's TIES deep learning-based fraud detection models [ACM SIGKDD 2021], recommender systems [ACM CIKM 2022], graph-based models [ACM CIKM 2021b], and the community-driven counter-misinformation platform Birdwatch used at Twitter [ASONAM 2021b].

(3) Graphs and Networks: Modeling and predicting over large-scale networks is crucial to mine actionable insights from large interconnected data, including social networks, e-commerce networks, knowledge graphs, spatio-temporal networks, and interaction networks. My relevant works include:

(4) Recommender Systems and Behavior Modeling: Recommender systems power much of the content and products that we see online. I develop efficient user-based and graph-based recommender systems that are accurate, scalable, and trustworthy [ACM CIKM 2021, ACM SIGKDD 2019]. I also investigate how malicious actors can manipulate deep learning-powered recommender systems for their ulterior motives. I create new techniques to quantify this robustness and design new adversarially robust deep recommender system architectures, to usher in an era of trustworthy recommendations. Relevant works include:

Publications

For my complete list of publications, please refer to my Google Scholar profile.

Highlights (selected from the full list below)

  • Examining the impact of sharing COVID-19 misinformation online on mental health [PDF] NEW!
    Gaurav Verma, Ankur Bhardwaj, Talayeh Aledavood, Munmun De Choudhury, Srijan Kumar
    Scientific Reports – Scientific Reports 12, 8045 (2022)
  • Cross-Platform Multimodal Misinformation: Taxonomy, Characteristics and Detection for Textual Posts and Videos NEW!
    Nicholas Micallef, Marcelo Sandoval-Castaneda, Mustaque Ahamad, Adi Cohen, Srijan Kumar, Nasir Memon
    AAAI ICWSM 2022 – The AAAI 16th International Conference on Web and Social Media
  • Characterizing, Detecting, and Predicting Online Ban Evasion. [PDF] NEW!
    Manoj Niverthi, Gaurav Verma, Srijan Kumar
    ACM WWW 2022 – The ACM Web Conference, 2022
    [Project page with data and code]
  • Overcoming Language Disparity in Online Content Classification with Multimodal Learning NEW!
    Gaurav Verma, Rohit Mujumdar, Jay Wang, Munmun De Choudhury, Srijan Kumar
    AAAI ICWSM 2022 – The AAAI 16th International Conference on Web and Social Media
    [Project page with data and code]
  • Rank List Sensitivity of Recommender Systems to Interaction Perturbations NEW!
    Sejoon Oh, Berk Ustun, Julian McAuley, Srijan Kumar
    ACM CIKM 2022 – 31st ACM International Conference on Information and Knowledge Management
  • PETGEN: Personalized Text Generation Attack on Deep Sequence Embedding-based Classification Models. [PDF] NEW!
    Bing He, Mustaque Ahamad, Srijan Kumar
    ACM SIGKDD 2021 – 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021
    [Project page with data and code] [Link to presentation (pptx)] [Link to presentation (pdf)]
  • Influence-guided Data Augmentation for Neural Tensor Completion. [PDF] NEW!
    Sejoon Oh, Sungchul Kim, Ryan Rossi, Srijan Kumar
    ACM CIKM 2021 – 30th ACM International Conference on Information and Knowledge Management, 2021
    [Project page with data and code]
  • Racism is a Virus: Anti-Asian Hate and Counterspeech in Social Media during the COVID-19 Crisis NEW!
    Bing He, Caleb Ziems, Sandeep Soni, Naren Ramakrishnan, Diyi Yang, Srijan Kumar
    IEEE/ACM ASONAM 2021 – The 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
    [Project page with data and code]
  • HawkEye: A Robust Reputation System for Community-based Misinformation Detection. [PDF] NEW!
    Rohit Mujumdar, Srijan Kumar
    IEEE/ACM ASONAM 2021 – The 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
  • The Role of the Crowd in Countering Misinformation: A Case Study of the COVID-19 Infodemic. [PDF]
    Nicholas Micallef*, Bing He*, Srijan Kumar, Mustaque Ahamad, Nasir Memon (* = equal contribution)
    IEEE Big Data 2020 -- Full Paper, research track (top 15%)
    [Project page with data and code]
  • Predicting Dynamic Embedding Trajectory in Temporal Interaction Networks [PDF]
    Srijan Kumar, Xikun Zhang, Jure Leskovec
    ACM SIGKDD, 2019 – 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2019

  • List of all publications

    1. Examining the impact of sharing COVID-19 misinformation online on mental health [PDF] NEW!
      Gaurav Verma, Ankur Bhardwaj, Talayeh Aledavood, Munmun De Choudhury, Srijan Kumar
      Scientific Reports – Scientific Reports 12, 8045 (2022)
    2. Rank List Sensitivity of Recommender Systems to Interaction Perturbations NEW!
      Sejoon Oh, Berk Ustun, Julian McAuley, Srijan Kumar
      ACM CIKM 2022 – 31st ACM International Conference on Information and Knowledge Management
    3. Implicit Session Contexts for Next-Item Recommendations NEW!
      Sejoon Oh, Ankur Bhardwaj, Jongseok Han, Sungchul Kim, Ryan Rossi, Srijan Kumar
      ACM CIKM 2022 – 31st ACM International Conference on Information and Knowledge Management - short paper
    4. Overcoming Language Disparity in Online Content Classification with Multimodal Learning NEW!
      Gaurav Verma, Rohit Mujumdar, Jay Wang, Munmun De Choudhury, Srijan Kumar
      AAAI ICWSM 2022 – The AAAI 16th International Conference on Web and Social Media
      [Project page with data and code]
    5. Cross-Platform Multimodal Misinformation: Taxonomy, Characteristics and Detection for Textual Posts and Videos NEW!
      Nicholas Micallef, Marcelo Sandoval-Castaneda, Mustaque Ahamad, Adi Cohen, Srijan Kumar, Nasir Memon
      AAAI ICWSM 2022 – The AAAI 16th International Conference on Web and Social Media
    6. Characterizing, Detecting, and Predicting Online Ban Evasion. [PDF] NEW!
      Manoj Niverthi*, Gaurav Verma*, Srijan Kumar
      ACM WWW 2022 – The ACM Web Conference, 2022
      [Project page with data and code]
    7. M2TRec: Metadata-aware Multi-task Transformer for Large-scale and Cold-start free Session-based Recommendations
      Walid Shalaby, Sejoon Oh, Amir Afsharinejad, Xiquan Cui, Srijan Kumar
      ACM RecSys 2022 – The ACM Conference Series on Recommender Systems (LBR), 2022
    8. M2P2: Multimodal Persuasion Prediction using Adaptive Fusion [PDF]
      Chongyang Bai, Haipeng Chen, Srijan Kumar, Jure Leskovec, V.S. Subrahmanian
      IEEE TMM 2021 – IEEE Transactions on Multimedia NEW!
      [Project page with data and code]
    9. PETGEN: Personalized Text Generation Attack on Deep Sequence Embedding-based Classification Models. [PDF] NEW!
      Bing He, Mustaque Ahamad, Srijan Kumar
      ACM SIGKDD 2021 – 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021
      [Project page with data and code] [Link to presentation (pptx)] [Link to presentation (pdf)]
    10. Influence-guided Data Augmentation for Neural Tensor Completion. [PDF] NEW!
      Sejoon Oh, Sungchul Kim, Ryan Rossi, Srijan Kumar
      ACM CIKM 2021 – 30th ACM International Conference on Information and Knowledge Management, 2021
      [Project page with data and code]
    11. Racism is a Virus: Anti-Asian Hate and Counterspeech in Social Media during the COVID-19 Crisis NEW!
      Bing He, Caleb Ziems, Sandeep Soni, Naren Ramakrishnan, Diyi Yang, Srijan Kumar
      IEEE/ACM ASONAM 2021 – The 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
      [Project page with data and code]
    12. HawkEye: A Robust Reputation System for Community-based Misinformation Detection. [PDF] NEW!
      Rohit Mujumdar, Srijan Kumar
      IEEE/ACM ASONAM 2021 – The 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
    13. Evaluating Graph Vulnerability and Robustness using TIGER. NEW!
      Scott Freitas, Diyi Yang, Srijan Kumar, Hanghang Tong, Polo Chau
      ACM CIKM 2021 – 30th ACM International Conference on Information and Knowledge Management, 2021
      [Project page]
    14. Deception Detection in Group Video Conversations using Dynamic Interaction Networks. [PDF]
      Srijan Kumar, Chongyang Bai, VS Subrahmanian, Jure Leskovec
      AAAI ICWSM 2021 – 15th International AAAI Conference on Web and Social Media, 2021
      [Project page with data]
    15. The Role of the Crowd in Countering Misinformation: A Case Study of the COVID-19 Infodemic. [PDF]
      Nicholas Micallef*, Bing He*, Srijan Kumar, Mustaque Ahamad, Nasir Memon (* = equal contribution)
      IEEE Big Data 2020 -- Full paper in research track
      [Project page with data and code]
    16. Higher-Order Label Homogeneity and Spreading in Graphs. [PDF]
      Dhivya Eswaran, Srijan Kumar, Christos Faloutsos
      ACM Web (WWW), 2020 – The ACM Web Conference, 2020
      [Github page with code and data]
    17. User Engagement with Digital Deception
      Maria Glenski, Svitlana Volkova, Srijan Kumar
      Peer-reviewed book chapter in 'Disinformation, Misinformation, and Fake News in Social Media 2020' by Springer.
    18. Predicting Dynamic Embedding Trajectory in Temporal Interaction Networks [PDF]
      Srijan Kumar, Xikun Zhang, Jure Leskovec
      ACM SIGKDD, 2019 – 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2019 [Oral presentation, research track (top 9%)]
      New dataset released: Account blocks on Wikipedia and Reddit. Link below.
      [Github page with code and data] [Slides] [Short explanation video]

      Included in the curriculum at: UCSD, Purdue University, LMU Munchen.

    19. Predicting the Visual Focus of Attention in Multi-person Discussion Videos [PDF]
      Chongyang Bai, Srijan Kumar, Jure Leskovec, Miriam Metzger, Jay Nunamaker, V.S. Subrahmanian
      IJCAI, 2019 – International Joint Conference on Artificial Intelligence, 2019
      New dataset released: 62 dynamic networks of who-interacts-with-whom. Link below.
      [Dataset] [Project page with demo]
    20. Predicting Dominance in Multi-person Videos [PDF]
      Chongyang Bai, Maksim Bolonkin, Srijan Kumar, Jure Leskovec, Judee Burgoon, Norah Dunbar, V.S. Subrahmanian
      IJCAI, 2019 – International Joint Conference on Artificial Intelligence, 2019
      [Dataset] [Project page with demo]
    21. Community Interaction and Conflict on the Web [PDF]
      Srijan Kumar, William L. Hamilton, Jure Leskovec, Dan Jurafsky
      ACM Web (WWW), 2018 – The ACM Web Conference, 2018
      New dataset released: Reddit community-to-community interlinks and harassment attacks
      [Project page: Data and Code] [Presentation slides (pptx)] [Presentation slides (pdf)]

      Included in the curriculum at: University of Waterloo

      Press: Russian spam accounts are still a big problem for Reddit (Engadget), What Reddit Tells Us About Political Coalitions and Conflicts (The Atlantic), Most Reddit battles are started by 1 percent of communities (Engadget), Tiny percent of Reddit communities spark majority of conflicts (CNET), One Percent of Subreddits Are Responsible for Most of the Raids on Reddit (VICE), and more by Inverse, TheNextWeb, theregister.co.uk

    22. Rev2: Fraudulent User Prediction in Rating Platforms [PDF]
      Srijan Kumar, Bryan Hooi, Disha Makhija, Mohit Kumar, Christos Faloutsos, V.S. Subrahmanian
      ACM WSDM, 2018 – 11th ACM International Web Search and Data Mining Conference, 2018
      New dataset released: Fraudsters on Amazon, Bitcoin networks, and Epinions.
      [Project page: Data and Codes] [Presentation slides (pptx)] [Poster]

      Included in the curriculum at: Stanford University

    23. False Information on Web and Social Media: A Survey [PDF]
      Srijan Kumar, Neil Shah
      Invited book chapter in Social Media Analytics: Advances and Applications, CRC Press, 2018
    24. Breaking Bad: Forecasting Adversarial Android Bad Behavior
      S. Li*, Srijan Kumar*, Tudor Dumitras, and V.S. Subrahmanian. (* indicates equal contribution).
      CyberSecurity, 2018 - From Database to Cybersecurity, 2018.
    25. Measuring the Evolution of a Scientific Field through Citation Frames. [PDF]
      David Jurgens, Srijan Kumar, Raine Hoover, Dan McFarland, Dan Jurafsky
      TACL, 2018 – Transactions of the Association for Computational Linguistics, 2018
      [Project page with data] [Code]
    26. Demand-Driven Single- and Multitarget Mixture Preparation Using Digital Microfluidic Biochips.
      Shalu, Srijan Kumar, A. Singla, Sudip Roy, K. Chakrabarty, P. P. Chakrabarti, and B. B. Bhattacharya.
      TODAES, 2018 - ACM Transactions on Design Automation of Electronic Systems.
    27. An Army of Me: Sockpuppets in Online Discussion Communities. [PDF]
      Srijan Kumar, Justin Cheng, Jure Leskovec, V.S. Subrahmanian.
      ACM Web (WWW), 2017 – 26th International World Wide Web (The ACM Web) Conference, 2017

      Best Paper Award Honorable Mention


      [Presentation Slides]

      Included in the curriculum at: University of Michigan, Virginia Tech University, Stanford University, Penn State University, Saarland University, and University of Freiburg.

      Documentary: Familiar Shapes by Heather D. Freeman

      Press: Sock puppet accounts unmasked by the way they write and post (New Scientist), Tool unmasks online puppeteers (New Scientist, print version), Spotting sockpuppets with science (TechCrunch), Sock Puppet Accounts on the Internet Getting You Down? Here’s How to Spot Them (WOWscience)

    28. Predicting Human Behavior: The Next Frontiers.
      V.S. Subrahmanian, Srijan Kumar.
      Science, 2017 – Science, vol. 355, issue 6324, pp. 489, 2017
    29. Spectral Lens: Explainable Diagnostics, Tools and Discoveries in Directed, Weighted Graphs
      Sebastian Goebl, Srijan Kumar, Christos Faloutsos
      IEEE ICDM, 2017 – IEEE International Conference on Data Mining, 2017
    30. Data-Driven Approaches towards Malicious Behavior Modeling
      Meng Jiang, Srijan Kumar, V.S. Subrahmanian, Christos Faloutsos
      ACM SIGKDD, 2017 (Tutorial) – Tutorial at 23rd ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2017
    31. Antisocial Behavior on the Web: Characterization and Detection
      Srijan Kumar, Justin Cheng, Jure Leskovec
      ACM Web (WWW), 2017 (Tutorial) – 26th International World Wide Web (The ACM Web) Conference, 2017
    32. Edge Weight Prediction in Weighted Signed Networks. [PDF]
      Srijan Kumar, Francesca Spezzano, V.S. Subrahmanian, Christos Faloutsos
      IEEE ICDM, 2016 – IEEE International Conference on Data Mining, 2016

      Top 10 most cited papers of ICDM in the last 5 years. [Link]


      New datasets released: Weighted, signed, temporal networks from Bitcoin OTC and Bitcoin Alpha.
      [Project page: Code] [Bitcoin OTC] [Bitcoin Alpha]
    33. Disinformation on the Web: Impact, Characteristics and Detection of Wikipedia Hoaxes. [PDF]
      Srijan Kumar, Robert West, Jure Leskovec.
      ACM Web (WWW), 2016 – 25th International World Wide Web (The ACM Web) Conference, 2016
      New datasets released: Hoax articles on Wikipedia
      [Project page: Data and Code]

      Included in the curriculum at: UIUC, University of Waterloo, McGill University, Texas A&M University, University of Hawaii, University of Freiburg, Leibniz University, Hannover, University of Alberta, University of Wellington, New Zealand, and Bari BigData winter school 2017.

      Press: Don't Ask Wikipedia To Cure the Internet (WIRED), Can Wikipedia Solve YouTube's Conspiracy Theory Problem? (Motherboard)

    34. Structure and Dynamics of Signed Citation Networks. [PDF]
      Srijan Kumar.
      ACM Web (WWW), 2016 companion - 25th International World Wide Web (The ACM Web) Conference companion, 2016.
      [Project page: Data and Code]
    35. Identifying Malicious Actors on Social Media
      Srijan Kumar, Francesca Spezzano, V.S. Subrahmanian
      IEEE/ACM ASONAM, 2016 tutorial – Advances in Social Network Analysis and Mining, 2016
    36. Stubborn Mining: Generalizing Selfish Mining and Combining with an Eclipse Attack. [PDF]
      Kartik Nayak*, Srijan Kumar*, Andrew Miller and Elaine Shi. (* indicates equal contribution).
      Euro S&P, 2016 - IEEE European Symposium on Security and Privacy, 2016.
    37. VEWS: A Wikipedia Vandal Early Warning System. [PDF]
      Srijan Kumar, Francesca Spezzano, V. S. Subrahmanian.
      ACM SIGKDD, 2015 – 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2015
      New datasets released: Vandals and vandalism on Wikipedia
      [Project page: Data and Code]
    38. Linguistic Harbingers of Betrayal: A Case Study on an Online Strategic Game. [PDF]
      Vlad Niculae, Srijan Kumar, Jordan Boyd-Graber, Cristian Danescu-Niculescu-Mizil.
      ACL, 2015 – 53rd Conference of the Association for Computational Linguistics, 2015
      New datasets released: Deception in Diplomacy, a conversation-based online game
      [Project page: Data and Code]

      Press: Should you worry about people who are too polite? (CNN), When Diplomacy Leads to Betrayal (The Wall Street Journal), Here’s a Good Reason to Be Wary of Overly Polite People (New York Magazine) and more here.

    39. Layout-Aware Mixture Preparation of Biochemical Fluids on Application-Specific Digital Microfluidic Biochips. [PDF]
      Sudip Roy, Partha P. Chakrabarti, Srijan Kumar, Krishnendu Chakrabarty, Bhargab B. Bhattacharya.
      ACM TODAES, 2015 - ACM Transactions on Design Automation of Electronic Systems, 2015.
    40. Demand-Driven Mixture Preparation and Droplet Streaming using Digital Microfluidic Biochips. [PDF]
      Sudip Roy, Srijan Kumar, P. P. Chakrabarti, B. B. Bhattacharya and K. Chakrabarty.
      ACM/IEEE DAC, 2014 - ACM/IEEE Design Automation Conference, 2014.
    41. Accurately Detecting Trolls in Slashdot Zoo via Decluttering. [PDF]
      Srijan Kumar, Francesca Spezzano, V. S. Subrahmanian.
      IEEE/ACM ASONAM, 2014 – Advances in Social Network Analysis and Mining, 2014
      New datasets released: Trolls in Slashdot, a signed social networking platform
      [Project page: Data and Code]
    42. Automatic Classification and Analysis of Interdisciplinary Fields in Computer Science. [PDF]
      T. Chakraborty, Srijan Kumar, D. Reddy, Suhansanu Kumar, Niloy Ganguly and Animesh Mukherjee.
      ASE/IEEE SocialCom, 2013 - ASE/IEEE International Conference on Social Computing.
    43. Routing-Aware Resource Allocation for Mixture Preparation in Digital Microfluidic Biochips. [PDF]
      Sudip Roy, P. P. Chakrabarti, Srijan Kumar, B. B. Bhattacharya and K. Chakrabarty.
      ISVLSI, 2013 - IEEE International Symposium on VLSI, 2013.
    44. Efficient Mixture Preparation of Biochemical Fluids using Digital Microfluidic Biochip. [PDF]
      Srijan Kumar, Sudip Roy, P. P. Chakrabarti, B. B. Bhattacharya and K. Chakrabarty.
      IEEE DDECS, 2013 - Sixteenth IEEE Symposium on Design and Diagnostics of Electronic Circuits and Systems, 2013.

    For all publications, please see my Google Scholar.

    Group

    The CLAWS - Computational Data Science Lab for the Web and Social Media - at Georgia Tech develops data science and applied machine learning solutions to solve the most pressing challenges facing users, communities, and platforms on the web and social media. We focus on pertinent online threats of malicious actors and dangerous content. We investigate the social and technological factors behind these issues and develop multi-pronged solutions to overcome these challenges.

    Sponsors: We are grateful for grants and gifts from NSF, DARPA, IDEaS, The Home Depot, Adobe, Facebook, and Microsoft.

    Postdocs:

    • Yeon-Chang Lee: graphs, recommender systems (visiting postdoc)

    Ph.D. Students:

    • Sejoon Oh: adversarial ML, recommender systems; ML@GT Fellow, Twitch PhD Fellowship finalist, Kwanjeong Educational Foundation Fellow
    • Bing He: misinformation (co-advised with Prof. Mustaque Ahamad)
    • Gaurav Verma: misinformation, multimodality; Adobe PhD Fellowship finalist (2022), College of Computing Rising Star Doctoral Student Research Awardee (2022)
    • Kartik Sharma: graphs and networks

    Masters and Undergraduate Students:

    • Andy Chung: BS; NSF CS4Grad fellowship awardee
    • Eric Ma: MS
    • Jongseok Han: MS
    • Harshal Gajjar: MS
    • Aaron Reich: MS
    • Ananya Malik: MS
    • Shivaen Ramshetty: MS
    • Sarath Nookala: MS
    • Ethan Kim: BS
    • Nathan Subrahmanian: BS

    Alumni:

    • Kritika Gupta: MS
    • Sivagami Nambi: MS 2022 -> Amazon
    • Mohit Raghavendra: BS 2022 -> MS at Georgia Tech
    • Soyoung Oh: MS 2022 -> Ph.D. at EPFL
    • Adhira Choudhury: HS 2022 -> BS at UC Berkeley
    • Zhen Jiang: BS 2022 -> MS at UC Berkeley
    • Manoj Niverthi: BS 2022
    • Rohit Sridhar: MS 2022 -> Ph.D. at Georgia Tech
    • Vivek Anand: MS 2023
    • Matthew Yang: MS 2021
    • Bharat Mamidibathula: MS 2021 -> Pinterest
    • Shreeshaa Kulkarni: MS 2021 -> Facebook
    • Ankur Bhardwaj: MS 2021 -> Walmart
    • Rohit Mujumdar: MS 2021 -> NCR
    • Andrew Wang: BS
    • Zan Huang: MS 2020 -> Kuaishou
    • Sunny Dhamnani: MS -> Facebook

    Datasets

    Web and social media safety and integrity datasets:

    1. COVID misinformation tweets dataset.
    2. COVID misinformation dataset containing misinformation from Twitter, Facebook, Reddit, and YouTube
    3. Ban Evasion dataset from Wikipedia, containing ground truth parent-child pairs of malicious accounts that evade bans.
    4. Bitcoin OTC platform rating network, with ground truth of fraudulent accounts.
    5. Bitcoin Alpha platform rating network, with ground truth of fraudulent accounts.
    6. Amazon network, with fake reviewers.
    7. Epinions network, with fake reviewers.
    8. Wikipedia edits, with ground truth of blocked accounts.
    9. Reddit posting data, with ground truth of accounts blocked for misbehavior.
    10. Wikipedia articles, with ground truth hoax articles on Wikipedia.

    Other web and social media datasets:

    1. The COVID-HATE dataset contains over 30 million tweets related to anti-Asian online hate and counterhate speech.
    2. Multi-lingual crisis dataset with content in English, Spanish, Portuguese, French, Chinese, and Hindi.
    3. MOOC platform: Student drop out.
    4. Reddit community: Inter-community (subreddit to subreddit) harassment.
    5. Reddit: User and subreddit embeddings.

    Dynamic network datasets:

    1. Web of Trust Network: Bitcoin OTC platform. Signed, weighted, temporal.
    2. Web of Trust Network: Bitcoin Alpha platform. Signed, weighted, temporal.
    3. Reddit: Community-to-community link network. Temporal, weighted.
    4. Wikipedia: User to page edit. Temporal, weighted, attributed.
    5. Reddit: User to subreddit posting activity. Temporal, weighted, attributed.
    6. MOOC platform: Student activity. Temporal.
    7. LastFM: User activity (listening to songs). Temporal.

    Teaching

    • CSE 8803 DSN: Data Science for Social Networks (Fall 2022)
    • CSE 6240: Web Search and Text Mining (Spring 2022)
    • CSE 8803 DSN: Data Science for Social Networks (Fall 2021)
    • CSE 6240: Web Search and Text Mining (Spring 2021)
    • CSE 6240: Web Search and Text Mining (Spring 2020)