Research

Selected Peer-Reviewed Publications

Joshua Ashkinaze, Hua Shen, Sai Avula, Eric Gilbert, and Ceren Budak. “Deep Value Benchmark: Measuring Whether Models Generalize Deep Values or Shallow Preferences.” The 39th Annual Conference on Neural Information Processing Systems (NeurIPS) (2025)

Note: Spotlight 🏆 (top 4% of submissions)

We introduce the Deep Value Benchmark (DVB), an evaluation framework that directly tests whether large language models (LLMs) learn fundamental human values or merely surface-level preferences. This distinction is critical for AI alignment: systems that capture deeper values are likely to generalize human intentions robustly, while those that capture only superficial patterns risk misaligned behavior. The DVB uses a novel experimental design with controlled confounding between deep values (e.g., moral principles) and shallow features (e.g., superficial attributes like formality). In the training phase, we expose LLMs to preference data with deliberately correlated deep and shallow features—for instance, where a user consistently prefers (non-maleficence, formal language) over (justice, informal language). The testing phase breaks these correlations, presenting choices between (justice, formal language) and (non-maleficence, informal language). This allows us to measure a model's Deep Value Generalization Rate (DVGR)—the probability of generalizing based on underlying values rather than shallow features. Across 9 models, the average DVGR is just 0.30, meaning all models generalize deep values less than chance. Counterintuitively, larger models exhibit slightly lower DVGR than smaller models. The dataset underwent three separate human validation experiments to ensure reliability. DVB provides an interpretable measure of a core feature of alignment, revealing that current models prioritize shallow preferences over deep values.
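
For intuition on the DVGR metric itself, here is a minimal sketch of how it could be computed from test-phase choices; the data layout, function name, and example numbers are hypothetical illustrations, not the paper's released code.

```python
# Hypothetical sketch of computing a Deep Value Generalization Rate (DVGR):
# each test-phase trial records whether the model chose the option carrying
# the deep value it was trained to prefer (True) or the option carrying the
# previously correlated shallow feature (False). Names and data are illustrative.

def dvgr(choices: list[bool]) -> float:
    """Fraction of test trials where the model generalized by deep value."""
    return sum(choices) / len(choices)

# Example: a model that follows the shallow feature on 7 of 10 held-out trials.
example_choices = [True, False, False, True, False, False, False, True, False, False]
print(dvgr(example_choices))  # 0.3, i.e., below the 0.5 chance level
```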


Joshua Ashkinaze, Ruijia Guan, Laura Kurek, Eytan Adar, Ceren Budak, and Eric Gilbert. “Seeing Like an AI: How LLMs Apply (and Misapply) Wikipedia Neutrality Norms.” The 20th International AAAI Conference on Web and Social Media (ICWSM) (2026, forthcoming)

Note: This research had substantial impact with key stakeholders: I was invited to give a talk at a Wikimedia research showcase, and the paper is cited in Wikipedia's strategy for integrating AI into the platform.

Large language models (LLMs) are trained on broad corpora and then used in communities with specialized norms. Is providing LLMs with community rules enough for models to follow these norms? We evaluate LLMs’ capacity to detect (Task 1) and correct (Task 2) biased Wikipedia edits according to Wikipedia’s Neutral Point of View (NPOV) policy. LLMs struggled with bias detection, achieving only 64% accuracy on a balanced dataset. Models exhibited contrasting biases (some under- and others over-predicted bias), suggesting distinct priors about neutrality. LLMs performed better at generation, removing 79% of words removed by Wikipedia editors. However, LLMs made additional changes beyond Wikipedia editors’ simpler neutralizations, resulting in high-recall but low-precision editing. Interestingly, crowdworkers rated AI rewrites as more neutral (70%) and fluent (61%) than Wikipedia-editor rewrites. Qualitative analysis found LLMs sometimes applied NPOV more comprehensively than Wikipedia editors but often made extraneous non-NPOV-related changes (such as grammar). LLMs may apply rules in ways that resonate with the public but diverge from community experts. While potentially effective for generation, LLMs may reduce editor agency and increase moderation workload (e.g., verifying additions). Even when rules are easy to articulate, having LLMs apply them like community members may still be difficult.
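
To make the high-recall, low-precision pattern concrete, here is a hypothetical word-level scoring sketch comparing an LLM rewrite against a Wikipedia editor's neutralization; the tokenization, scoring choices, and example sentences are illustrative and are not the paper's evaluation code.

```python
# Hypothetical word-level scoring of an LLM rewrite against an editor rewrite.
# Recall: share of editor-removed words the LLM also removed.
# Precision: share of LLM-removed words the editor also removed.
# Illustrative sketch only, not the paper's evaluation pipeline.

def removed_words(original: str, rewrite: str) -> set[str]:
    return set(original.lower().split()) - set(rewrite.lower().split())

def recall_precision(original: str, editor_rewrite: str, llm_rewrite: str):
    editor_removed = removed_words(original, editor_rewrite)
    llm_removed = removed_words(original, llm_rewrite)
    overlap = editor_removed & llm_removed
    recall = len(overlap) / max(len(editor_removed), 1)
    precision = len(overlap) / max(len(llm_removed), 1)
    return recall, precision

original = "critics say the groundbreaking study clearly proved the controversial theory"
editor = "critics say the study proved the controversial theory"  # minimal neutralization
llm = "critics say the study supported the theory"                # removes and rephrases more
print(recall_precision(original, editor, llm))  # (1.0, 0.5): high recall, lower precision
```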


Joshua Ashkinaze, Emily Fry, Narendra Edara, Eric Gilbert, and Ceren Budak. “Plurals: A System for Guiding LLMs Via Simulated Social Ensembles.” Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems

Note 1: Honorable Mention 🏆 (top 5% of submissions)

Note 2: Check out the GitHub library!

Recent debates raised concerns that language models may favor certain viewpoints. But what if the solution is not to aim for a “view from nowhere” but rather to leverage different viewpoints? We introduce Plurals, a system and Python library for pluralistic AI deliberation. Plurals consists of Agents (LLMs, optionally with personas) which deliberate within customizable Structures, with Moderators overseeing deliberation. Plurals is a generator of simulated social ensembles. Plurals integrates with government datasets to create nationally representative personas, includes deliberation templates inspired by deliberative democracy, and allows users to customize both information-sharing structures and deliberation behavior within Structures. Six case studies demonstrate fidelity to theoretical constructs and efficacy. Three randomized experiments show simulated focus groups produced output resonant with an online sample of the relevant audiences (chosen over zero-shot generation in 75% of trials). Plurals is both a paradigm and a concrete system for pluralistic AI.
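
The sketch below is a self-contained toy of the Agent / Structure / Moderator pattern described above. It deliberately does not use the Plurals library itself (see the GitHub repository and documentation for the real API); the class names, the echo stand-in for an LLM call, and the example personas are illustrative only.

```python
# Toy illustration of the Agents / Structures / Moderator pattern described
# above. This does NOT use the Plurals library; see its docs for the real API.
# Each "agent" is a persona plus a stand-in generation function.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Agent:
    persona: str
    generate: Callable[[str], str]  # e.g., a wrapper around an LLM call

class Chain:
    """A sequential Structure: each agent sees the prior agents' responses."""
    def __init__(self, agents: List[Agent], moderator: Agent):
        self.agents, self.moderator = agents, moderator

    def process(self, task: str) -> str:
        transcript = []
        for agent in self.agents:
            prompt = f"Persona: {agent.persona}\nTask: {task}\nPrior responses: {transcript}"
            transcript.append(agent.generate(prompt))
        return self.moderator.generate(f"Summarize the deliberation: {transcript}")

# Stand-in for an LLM call so the sketch runs without API keys.
echo = lambda prompt: f"[response to: {prompt[:40]}...]"
chain = Chain([Agent("rural teacher", echo), Agent("urban nurse", echo)],
              moderator=Agent("neutral moderator", echo))
print(chain.process("Draft a public-health message about flu shots."))
```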


Joshua Ashkinaze, Julia Mendelsohn, Qiwei Li, Ceren Budak, and Eric Gilbert. “How AI ideas affect the creativity, diversity, and evolution of human ideas: Evidence from a large, dynamic experiment.” Proceedings of the 2025 ACM Collective Intelligence Conference

Note: Honorable Mention 🏆


Shubham Atreja, Joshua Ashkinaze, Lingyao Li, Julia Mendelsohn, Libby Hemphill. “What's in a Prompt?: A Large-Scale Experiment to Assess the Impact of Prompt Design on the Compliance and Accuracy of LLM-Generated Text Annotations.” The 19th International AAAI Conference on Web and Social Media (ICWSM) (2025)

Manually annotating data for computational social science tasks can be costly, time-consuming, and emotionally draining. While recent work suggests that LLMs can perform such annotation tasks in zero-shot settings, little is known about how prompt design impacts LLMs' compliance and accuracy. We conduct a large-scale multi-prompt experiment to test how model selection (GPT-4o, GPT-3.5, PaLM2, and Falcon7b) and prompt design features (definition inclusion, output type, explanation, and prompt length) impact the compliance and accuracy of LLM-generated annotations on four highly relevant and diverse CSS tasks (toxicity, sentiment, rumor stance, and news frames). Our results show that LLM compliance and accuracy are prompt-dependent. For instance, prompting for numerical scores instead of labels reduces all LLMs' compliance and accuracy. Concise prompts can significantly reduce prompting costs but also lead to lower accuracy on tasks like toxicity. Furthermore, minor prompt changes like asking for an explanation can cause large changes in the distribution of LLM-generated labels. By assessing the impact of prompt design on the quality and distribution of LLM-generated annotations, this work serves as both a practical guide and a warning for using LLMs in CSS research.
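
As a concrete illustration of what a multi-prompt design can look like, here is a hypothetical sketch that crosses some of the factors studied above (definition inclusion, output type, explanation request) into a grid of annotation prompts; the wording and factor levels are mine, not the paper's materials, and prompt length varies implicitly with which components are included.

```python
# Illustrative sketch of crossing prompt-design factors (definition inclusion,
# output type, explanation request) into a grid of annotation prompts for a
# toxicity task. Wording and levels are illustrative, not the paper's exact materials.

from itertools import product

definition = {"with_def": "Toxicity means rude or disrespectful language. ",
              "no_def": ""}
output_type = {"label": "Answer 'toxic' or 'not toxic'.",
               "score": "Answer with a toxicity score from 0 to 100."}
explanation = {"explain": " Briefly explain your answer.",
               "no_explain": ""}

def build_prompt(text, def_key, out_key, exp_key):
    return (f"{definition[def_key]}Classify the following comment. "
            f"{output_type[out_key]}{explanation[exp_key]}\n\nComment: {text}")

grid = list(product(definition, output_type, explanation))
print(len(grid), "prompt variants per item")   # 8 variants
print(build_prompt("example comment", *grid[0]))
```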


Joshua Ashkinaze, Eric Gilbert, Ceren Budak. “The Dynamics of (Not) Unfollowing Misinformation Spreaders.” Proceedings of the 2024 ACM Web Conference (Formerly WWW)

Note: Selected as an oral presentation at the main conference and for the special Online Trust and Safety day.

Many studies explore how people "come into" misinformation exposure. But much less is known about how people "come out of" misinformation exposure. Do people organically sever ties to misinformation spreaders? And what predicts doing so? Over six months, we tracked the frequency and predictors of ~900K followers unfollowing ~5K health misinformation spreaders on Twitter. We found that misinformation ties are persistent. Monthly unfollowing rates are just 0.52%. In other words, 99.5% of misinformation ties persist each month. Users are also 31% more likely to unfollow non-misinformation spreaders than they are to unfollow misinformation spreaders. Although generally infrequent, the factors most associated with unfollowing misinformation spreaders are (1) redundancy and (2) ideology. First, users initially following many spreaders, or who follow spreaders that tweet often, are most likely to unfollow later. Second, liberals are more likely to unfollow than conservatives. Overall, we observe a strong persistence of misinformation ties. The fact that users rarely unfollow misinformation spreaders suggests a need for external nudges and the importance of preventing exposure from arising in the first place.
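
As a small worked example of how persistent these ties are, the calculation below compounds the 0.52% monthly unfollow rate over the six-month study window under the simplifying (and my own) assumption that the rate is constant and independent across months.

```python
# Worked example: how a 0.52% monthly unfollow rate compounds over the
# six-month study window, assuming (for illustration) a constant rate.
monthly_unfollow = 0.0052
monthly_persist = 1 - monthly_unfollow      # ~99.5% of ties survive each month
six_month_persist = monthly_persist ** 6    # ties surviving all six months
print(f"{monthly_persist:.3%} persist each month")
print(f"{six_month_persist:.1%} persist across six months")  # ~96.9%
```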


Karen Bonuck, Akilah Collins-Anderson, Joshua Ashkinaze, Alison Karasz, and Amanda Schwartz. “Environmental scan of sleep health in early childhood programs.” Behavioral Sleep Medicine 18, no. 5 (2020): 598-610.

Objective: To ascertain how sleep health knowledge is translated to early care and education (ECE) programs, using a multi-component environmental scan.

Methods: A website scan identified organizations' sleep content re: recommended practices, developmental effects, and "actionable" ratings (0–2). ECE staff surveys assessed preparedness, practices, and beliefs about addressing sleep health and sleep problems in ECE programs. Semi-structured interviews with stakeholders from the ECE, pediatric and sleep communities assessed awareness, priorities, and practices at their organizations.

Results: Of 15 websites scanned, half lacked sleep content on links to development, optimal duration, or scientific background. ECE staff (n=31) were comfortable speaking to parents about healthy sleep, and with incorporating sleep education and guidance into ECE. Stakeholders (n=15) rated healthy sleep as a high-relevance but lower-priority issue. Within ECE settings, stakeholders reported that knowledge about specific links to health and development was poor, and that sleep health was often obscured by 'safe sleep' issues. Their recommendations included: linking sleep health to 'hot topics' such as obesity or preschool suspensions and expulsions, integrating it with the teaching of routines, and raising public awareness.

Conclusion: Despite understanding that healthy sleep promotes school readiness, there is insufficient specific, actionable information in ECE training, programs, or policies. Findings suggest a need for an awareness campaign with clear, actionable messaging, dissemination of turnkey materials, and integration with policy and professional training systems.

Currently Under Review

Li, Q., Zhang, S., Kasper, A. T., Ashkinaze, J., Eaton, A. A., Schoenebeck, S., & Gilbert, E. (2026). Reporting non-consensual intimate media: An audit study of deepfakes. Manuscript submitted for publication to ACM SIGCHI Conference on Computer-Supported Cooperative Work & Social Computing. Preprint

Goray, C., Li, Q., Ashkinaze, J., Le, V., Gilbert, E., & Schoenebeck, S. (2026). For their eyes only: What makes sharing photos of others inappropriate. Manuscript submitted for publication to ACM SIGCHI Conference on Human Factors in Computing Systems.

Kurek, L., Ashkinaze, J., Budak, C., & Gilbert, E. (2026). Follow nudges without budges: A field experiment on misinformation followers didn't change follow networks. Manuscript submitted for publication to International Conference on Web and Social Media.

Peer-Reviewed Non-Archival Conference Presentations and Posters

Joshua Ashkinaze, Ceren Budak, Eric Gilbert. "Plurals: A System for Simulating Deliberation (With Applications to Political Communication)" 18th Annual Political Networks and Computational Social Science Conference (PolNet-PaCSS), Harvard University and Northeastern University (2024)

Joshua Ashkinaze, Ceren Budak, Eric Gilbert. "PlurChain: Towards a Society of Pluralistic Artificial Intelligence" International Conference of Computational Social Science (IC2S2), University of Pennsylvania (2024) [Research Impact Nomination⭐; Resource Nomination⭐]

Joshua Ashkinaze, Eric Gilbert, Ceren Budak. "The Dynamics of (Not) Unfollowing Misinformation Spreaders" International Conference of Computational Social Science (IC2S2), University of Pennsylvania (2024) [Social Impact Nomination⭐]

Joshua Ashkinaze, Eric Gilbert, Ceren Budak. "Plurals: A system for pluralistic AI via simulated social ensembles" NeurIPS Pluralistic Alignment Workshop, Vancouver, British Columbia (2024)

Invited Talks

Joshua Ashkinaze. "Seeing Like an AI: How LLMs Apply (And Misapply) Wikipedia Neutrality Norms" Wikimedia Foundation, Online (2025)

Non-Peer-Reviewed Presentations

Joshua Ashkinaze. "Multi-agent focus groups for depolarizing research communication" University of Michigan Political Communication Working Group, University of Michigan--Ann Arbor (2024)

Joshua Ashkinaze. "A first look at the social dynamics of AI art diffusion" University of Michigan Media Studies Research Workshop, University of Michigan--Ann Arbor (2024)

Grants And Awards

Initiative for Democracy and Civic Empowerment, Ann Arbor, MI (October 2025)

  • Awarded $2,433 for a project using Plurals, a multi-agent system I created, to run randomized controlled trials (RCTs)

OpenAI Researcher Access Grant, San Francisco, CA (January 2024)

  • Awarded $5,000 in credits for multiple projects related to AI creativity, AI alignment, and AI pluralism

Undergraduate Research Opportunities (UROP) Mentor Grant, Ann Arbor, MI (January 2024)

  • Awarded a $2,400 grant to mentor undergraduates on projects related to the effects of AI on society

Civic Health Project Grant on LLMs and Partisan Animosity [Finalist], Palo Alto, CA (October 2023)

  • Finalist for Civic Health Project RFP on innovative uses of LLMs to reduce polarization

Phi Beta Kappa, Oberlin, OH (May 2019)

  • Inducted into honor society

Thomas Kutzen '76 Prize in Economics, Oberlin, OH (May 2019)

  • Awarded to college senior for outstanding economics research

Hanson Prize in Economics, Oberlin, OH (May 2018)

  • Awarded to college junior for excellence in economics

Jere Bruner Research Grant, Oberlin, OH (May 2016--May 2017)

  • Received a grant to study how macroeconomic conditions correlate with dream content

Software

Plurals: Guiding LLMs via Simulated Social Ensembles (LINK) | Python (Jan. 2024 -- Present)

  • Open-source multi-agent library for pluralistic artificial intelligence with extensive test suite, CI/CD workflows, and auto-deployed documentation

Shell Journaling App (LINK) | Bash (Jan. 2024 -- Present)

  • A small script that acts as a daily journal manager and can be run entirely from the command line

Mathematical Art (LINK) | Java, JavaScript, Python, Processing, GLSL (Jan. 2019 -- Present)

  • Created 20+ mathematical art algorithms and 250+ compositions using probability theory and color theory; displayed at 3 exhibitions and sold 75+ prints

EmotaPal (LINK) | Python, scikit-learn, NLTK, Pandas (July 2019 -- July 2020)

  • Built an open-source package that maps color palettes to emotions using supervised learning and NLP. Note: this predates multimodal LLMs!
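
As a toy illustration of the general idea (assigning emotion words to palette colors), the sketch below uses a nearest-neighbor lookup in RGB space; this is not EmotaPal's actual method, API, or training data, and the emotion anchors are invented for the example.

```python
# Toy illustration of mapping a color palette to emotion words via
# nearest-neighbor lookup in RGB space. NOT EmotaPal's actual method or API;
# anchors and example colors are invented for illustration.

from math import dist

EMOTION_ANCHORS = {          # illustrative RGB anchors, not learned parameters
    "calm": (120, 170, 200),
    "energetic": (230, 60, 40),
    "gloomy": (60, 60, 70),
    "cheerful": (250, 210, 80),
}

def palette_emotions(palette):
    """Return the closest emotion word for each RGB color in a palette."""
    return [min(EMOTION_ANCHORS, key=lambda e: dist(EMOTION_ANCHORS[e], rgb))
            for rgb in palette]

print(palette_emotions([(110, 160, 210), (245, 200, 90)]))  # ['calm', 'cheerful']
```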