Exam

Reflection on Content Moderation in Social Media and AI Lecture


Published: November 11, 2024
Number of pages: 62
Written in: 2024/2025
Type: Exam
Contains: Questions and answers
Seller: Onlystudents

Problem 1

Subject: Reflection on Content Moderation in Social Media and AI Lecture

I hope this memo finds you well. I recently attended a thought-provoking lecture on "Content Moderation in
Social Media and AI" by Serge Abiteboul, which has prompted me to reflect on some critical issues raised
during the session.

Social media platforms are like huge online meeting places where people from all over the world gather. They
help us stay connected with friends, family, and even strangers: you can chat, share photos, and follow
people you admire.
Social media can also sway our opinions and decisions. When you see something trending or popular, it can
affect what you think or believe. It can influence elections and political discussions, where people express
their views and sometimes sway the outcome.
However, social media isn't all rainbows and unicorns. Some people become addicted and spend too much time
scrolling. Others use it to spread hate speech or to harass others.

One key takeaway from the lecture was the concept of responsibility in social media governance. While social
media platforms wield considerable power akin to states, they lack the same level of legitimacy and
accountability. Unlike elected governments, these platforms operate based on terms of service agreements that
users often overlook. This raises questions about the legitimacy of their authority to moderate content and shape
societal norms.

Furthermore, the lecture shed light on the complexities of content moderation, emphasizing the scale, variety,
and velocity of content that platforms must navigate. From terrorism to hate speech to fake news, the spectrum
of problematic content is vast and evolving. Moreover, the ambiguity of human language and the lack of
context further complicate moderation efforts, making it challenging to discern between harmful and benign
content.

Machine learning algorithms were presented as a promising solution for content moderation due to their
scalability and efficiency. However, they are not without limitations, particularly in addressing nuanced issues
such as hate speech and fake news. While algorithms may outperform humans in detecting certain types of
content, they still require human oversight to ensure accuracy and mitigate biases.


The lecture also discussed regulatory approaches to address content moderation challenges. While self-
regulation by social media companies has proven inadequate, allowing states to regulate poses risks to freedom
of speech and expression. An alternative proposed was the supervision of social media platforms by
independent regulators, working in collaboration with stakeholders to establish reasonable standards for content
moderation.

Moreover, the lecture emphasized the importance of user education and community involvement in combating
misinformation and fostering a healthier online environment. Platforms that prioritize truthfulness and
community-driven moderation models, such as Wikipedia and WT.Social, offer promising alternatives to the
current ad-driven business models prevalent in social media.

In conclusion, the lecture underscored the urgent need for a comprehensive approach to address the
complexities of content moderation in social media. As a team, we must remain vigilant and proactive in
promoting responsible online behavior, advocating for transparency and accountability in platform governance,
and supporting initiatives that prioritize user well-being and societal values.

Problem 2 (10 points): Data science lifecycle
Alex, a data scientist in the human resources department of a technology company MegaSoft, is
running job applicants through an already trained model that ranks applicants for job openings
at the company. The applicant dataset has four features: sex, race, experience (measured in
years), and performance on a skills test.
Alex conducts data profiling on the applicant dataset and produces the table below. She finds that all
demographic groups perform comparably on the skills test (results are not shown), but that experience
differs across groups. Further, she knows that every applicant must take a skills test to apply for a job
at MegaSoft, but that some applicants do not report their experience.

(a) (3 points) To prepare the data to run through the ranking model, Alex replaces missing
values (NULL) in the experience feature with the overall mean value for that feature in the
dataset. She expects the ranking model to prioritize (rank higher) those applicants who score
higher on the skills test and have more years of experience. Looking at the data profiling table
above, which applicant group(s) may be disadvantaged by Alex’s imputation method and why?
Answer
The groups that may be disadvantaged by Alex's imputation method are those with a higher proportion
of missing values in the experience feature, especially where the group's actual experience tends to
exceed the overall mean.
In this case, it appears that male applicants, particularly those who are white, have more missing
values (80 for white, 63 for other) than female applicants (60 for white, 42 for other).
Imputing the overall mean may unfairly penalize these groups because their actual experience is not
considered: an applicant with above-average experience who did not report it is assigned only the
average, and may be ranked lower than they should be.
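
To make the mechanism concrete, here is a minimal sketch of overall-mean imputation. The group labels and experience values are invented for illustration and are not taken from the assignment's profiling table.

```python
import pandas as pd

# Hypothetical toy data (NOT from the profiling table in the assignment).
# Group A is more experienced on average than group B.
applicants = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "B"],
    "experience": [10.0, 12.0, None, 2.0, 4.0, None],
})

# Overall mean of the reported values; pandas skips missing values by default.
overall_mean = applicants["experience"].mean()

# Alex's method: every missing value receives the same overall mean.
applicants["experience_imputed"] = applicants["experience"].fillna(overall_mean)

# Per-group means of the reported values, for comparison.
group_means = applicants.groupby("group")["experience"].mean()

print("overall mean:", overall_mean)
print(group_means)
print(applicants)
```

In this toy data the non-reporting applicant in group A is assigned a value below the group's reported mean, while the one in group B is assigned a value above it, so the more experienced group is the one that loses rank.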

(b) (3 points) Propose an alternative data imputation method that may improve the ranking of
individuals from the group(s) you identified in (a).
Answer
An alternative data imputation method could involve imputing missing values based on subgroup means
rather than the overall mean.
This method would consider the mean experience within each subgroup (e.g., male white, male other,
female & non-binary white, female & non-binary other) and impute missing values accordingly.
By doing so, the imputation would better reflect the experience distribution within each subgroup,
potentially mitigating the disadvantage faced by groups with higher proportions of missing values.
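
The subgroup-mean approach described above could be sketched as follows. The column names and values are assumptions made for illustration; the fallback to the overall mean for a subgroup with no reported values is also an added assumption, not part of the original answer.

```python
import pandas as pd

# Hypothetical toy data; values are invented, not taken from the assignment.
applicants = pd.DataFrame({
    "sex": ["male", "male", "male",
            "female & non-binary", "female & non-binary", "female & non-binary"],
    "race": ["white", "white", "white", "other", "other", "other"],
    "experience": [10.0, 12.0, None, 2.0, 4.0, None],
})

# Mean of the reported experience within each (sex, race) subgroup,
# broadcast back to one value per row.
subgroup_mean = applicants.groupby(["sex", "race"])["experience"].transform("mean")

# Fill each missing value with its own subgroup's mean. If a subgroup has no
# reported values at all, fall back to the overall mean (an assumption).
applicants["experience_imputed"] = (
    applicants["experience"]
    .fillna(subgroup_mean)
    .fillna(applicants["experience"].mean())
)

print(applicants)
```

Note that this uses the protected attributes themselves during preprocessing, which is a design choice that would need to be justified in a real hiring pipeline.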

(c) (4 points) The data imputation method described in (a) can introduce technical bias.
Explain how this type of technical bias relates to pre-existing and emergent bias in MegaSoft’s
hiring example. Be concise and concrete.
Answer
Imputing missing values with the overall mean ignores differences in experience between demographic
groups, so it can perpetuate pre-existing biases.
For example, if certain demographic groups have historically faced barriers to reporting their experience
accurately, they may have higher proportions of missing values.
Imputing these missing values with the overall mean could systematically disadvantage these groups,
further entrenching pre-existing biases in MegaSoft's hiring process.

Emergent biases are new biases that arise from the data imputation method and its effect on the
hiring process.
In this case, the technical bias introduced by overall-mean imputation could systematically favor some
demographic groups over others: groups that are more likely to have missing values in the experience
feature could be unfairly penalized, which skews the ranking results.
This emergent bias could lead to discriminatory hiring outcomes, where certain groups are systematically
disadvantaged in the hiring process.

Problem 3 (40 points): Explaining text classification with SHAP
For this programming portion of this assignment, we will use a subset of the text corpus from the
20 newsgroups dataset. This is the dataset used in the LIME paper to generate the
Christianity/Atheism classifier, and to illustrate the concepts. However, rather than explaining
predictions of a classifier with LIME, we will use this dataset to explain predictions with SHAP.
(a) (5 points) Use the provided Colab template notebook to import the 20 newsgroups dataset
from sklearn.datasets, importing the same two-class subset as was used in the LIME paper: Atheism and
Christianity. Use the provided code to fetch the data, split it into training and test sets, then fit a TF-IDF
vectorizer to the data, and train an SGDClassifier.
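
The steps in part (a) can be sketched as below. This is not the Colab template's code, which is not shown here; the function names, the 25% test split, and the random seed are assumptions, and the category names are the standard scikit-learn identifiers for the Atheism and Christianity newsgroups.

```python
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

# scikit-learn category names for the two-class subset.
CATEGORIES = ["alt.atheism", "soc.religion.christian"]


def load_atheism_christianity():
    """Fetch the two-class subset (downloads the corpus on first use)."""
    data = fetch_20newsgroups(subset="all", categories=CATEGORIES)
    return data.data, data.target


def fit_pipeline(texts, labels, seed=0):
    """Split the data, fit TF-IDF on the training split only, train the classifier."""
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.25, random_state=seed, stratify=labels
    )
    vectorizer = TfidfVectorizer()
    X_train_vec = vectorizer.fit_transform(X_train)  # learn vocabulary and IDF weights
    X_test_vec = vectorizer.transform(X_test)        # reuse them on the test split
    clf = SGDClassifier(random_state=seed)
    clf.fit(X_train_vec, y_train)
    return vectorizer, clf, X_test_vec, y_test
```

A typical call would be `texts, labels = load_atheism_christianity()` followed by `vectorizer, clf, X_test_vec, y_test = fit_pipeline(texts, labels)`.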

(b) (5 points) Generate a confusion matrix (hint: use sklearn.metrics.confusion_matrix) to
evaluate the accuracy of the classifier. The confusion matrix should contain a count of correct
Christian, correct Atheist, incorrect Christian, and incorrect Atheist predictions. Use SHAP’s explainer to
generate visual explanations for any 5 documents in the test set. The documents you select should
include some correctly classified and some misclassified documents.
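
To build intuition for what a SHAP explanation of a linear text classifier reports, the sketch below computes per-word contributions by hand with NumPy. It does not call the shap library. It uses the closed form for a linear model under the assumption of independent features, where each feature's contribution is its weight times its deviation from the background mean. The vocabulary, weights, and TF-IDF values are hypothetical.

```python
import numpy as np


def linear_shap_values(weights, background, x):
    """Contribution of each feature for a linear model, assuming independent
    features: phi_i = w_i * (x_i - mean of feature i over the background data)."""
    weights = np.asarray(weights, dtype=float)
    baseline = np.asarray(background, dtype=float).mean(axis=0)
    return weights * (np.asarray(x, dtype=float) - baseline)


# Hypothetical four-word vocabulary and linear model (positive = Christianity).
vocab = ["god", "church", "atheist", "evidence"]
weights = np.array([1.5, 1.0, -2.0, -0.5])
intercept = 0.1

# Hypothetical TF-IDF rows: two background documents and one document to explain.
background = np.array([[0.2, 0.0, 0.4, 0.0],
                       [0.0, 0.6, 0.0, 0.2]])
x = np.array([0.5, 0.0, 0.1, 0.3])

phi = linear_shap_values(weights, background, x)

# The contributions sum to the gap between this document's score and the
# average score over the background data.
score = x @ weights + intercept
expected_score = background.mean(axis=0) @ weights + intercept

# Rank words by the magnitude of their contribution.
ranked = sorted(zip(vocab, phi), key=lambda pair: abs(pair[1]), reverse=True)
for word, value in ranked:
    print(f"{word:10s} {value:+.2f}")
print("score - expected score:", score - expected_score)
```

With a fitted SGDClassifier, the weights would come from `clf.coef_[0]` and the intercept from `clf.intercept_[0]`; sparse TF-IDF rows would need converting to dense arrays first.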

(c) (15 points) Use SHAP’s explainer to study misclassified documents, and the features
(words) that contributed to their misclassification, by taking the following steps:
1. Report the accuracy of the classifier, as well as the number of misclassified documents.
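
The accuracy and the number of misclassified documents in step 1 can both be read off the confusion matrix from part (b). A minimal sketch, using invented labels in place of real classifier output (0 = Atheism, 1 = Christianity is an assumed encoding):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical true and predicted labels (0 = Atheism, 1 = Christianity).
y_true = np.array([0, 0, 0, 1, 1, 1, 1, 0])
y_pred = np.array([0, 1, 0, 1, 1, 0, 1, 0])

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])

# Row 0: truly Atheism. Predicted Atheism is correct; predicted Christian is wrong.
correct_atheist, incorrect_christian = cm[0]
# Row 1: truly Christianity. Predicted Atheism is wrong; predicted Christian is correct.
incorrect_atheist, correct_christian = cm[1]

# Accuracy is the share of predictions on the diagonal.
accuracy = np.trace(cm) / cm.sum()

# Positions of the misclassified documents, for later inspection.
misclassified_idx = np.flatnonzero(y_true != y_pred)

print(cm)
print("accuracy:", accuracy)
print("misclassified:", len(misclassified_idx), "at positions", misclassified_idx.tolist())
```

The indices in `misclassified_idx` are the documents whose explanations would then be examined.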
