3/18 (Sat.) The Era of Big Data (Host: Sherry)

Post by Sherry Liao » Sun Mar 12, 2017 10:42 pm

When the Big Lie Meets Big Data
by Peter Bruce
https://blogs.scientificamerican.com/gu ... -big-data/

Joseph Goebbels said “If you tell a lie big enough and keep repeating it, people will eventually come to believe it.” In the era of big data, however, numerous smaller lies, guided by machine learning, may be more effective than a few big lies. And even the literal truth seasoned with innuendo will do. Studies show that the salient concept in a statement can persist and dominate the literal truth in which it is embedded. For example, consider the statement, “There is no evidence that Hillary Clinton ran a child sex slavery operation out of Comet Pizza.” Repeat it often enough, and people are prone to remember just the “Hillary Clinton… child sex slavery” concept, not its negation.

President Trump’s disregard for the truth often seems impulsive, and not strategic—a bit like that of a braggart in a bar (were it not for the fact that he is a teetotaler). Reacting angrily to news reports that his inauguration crowd was not as big as that of President Obama when he first came into office, or even as big as the protesting crowd gathered by the Women’s March, Trump insisted (and had his press secretary insist) that his inaugural crowds were the biggest in U.S. history (not true). During the campaign, he claimed to have seen people in Jersey City celebrating the 9/11 attack by dancing in the street (also not true). After the election, he claimed that millions of votes by illegal immigrants cost him the popular vote (no evidence).

President Trump’s enthusiastic embrace of casual lying is partly seen as a reflection and a product of the general phenomenon of “fake news.” The impact of fake news is real—Comet Pizza, a restaurant in Washington DC, was indeed believed by many to be the site of a child sex trafficking ring that was operating with the involvement of Hillary Clinton. It was the scene of a bizarre siege on Dec. 4 when Edgar Welch, heavily armed, entered the restaurant and announced that he was there to investigate matters. The sex-slave story was pure fiction, but it was all over the internet. Edgar Welch believed it, and traveled to DC from North Carolina, ready to do battle.

Where does fake news come from? It turns out there is a profitable cottage industry in creating and purveying fake news. NPR tracked down one fake-news creator, Jestin Coler, in a Los Angeles suburb. Coler started out catering to the alt-right, and found a rapidly expanding audience during the recent presidential campaign—“a huge Facebook group of kind of rabid Trump supporters just waiting to eat up this red meat.”

Jestin Coler is a Democrat, and says he wants to expose the phenomenon of fake news. But the money is good (over $100,000 per year), and, in the end, what he wanted people to do was to click on his stories so he could collect advertising revenue.

But suppose you were interested in getting someone to do something more than click? Something political—like vote for a particular candidate, go to a rally, write your Congressman, etc.? That’s where Big Data comes into the picture.

Since 2004, political consultants have used big-data models to predict how people will vote, and indicate whether they should be sent messages to encourage them to do so (and if so, which messages). They use randomized experiments (A-B tests) to determine the effect of different messages at the individual level, and correlate this with other variables, such as demographic data and voting data, to build predictive models. All this is similar to what happens in the marketing realm (e.g. should a given consumer be sent solicitation A or B), and President Obama was a pioneer in the use of predictive analytics to target individual voters.
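To make the A-B test idea above concrete, here is a minimal Python sketch. It is only an illustration: the function name `two_proportion_z` and the response counts are hypothetical, and the article does not say which statistical test campaign consultants actually use. It compares the response rates of two messages with a standard two-proportion z-test.

```python
import math


def two_proportion_z(success_a, n_a, success_b, n_b):
    """Return (lift, z) for the difference in response rates, B minus A."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled rate under the null hypothesis that both messages work equally well.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return p_b - p_a, (p_b - p_a) / se


# Hypothetical counts: 1,000 voters got message A, 1,000 got message B.
lift, z = two_proportion_z(success_a=200, n_a=1000, success_b=250, n_b=1000)
print(f"lift = {lift:.3f}, z = {z:.2f}")
```

A z value above roughly 2 suggests the difference between the two messages is unlikely to be chance alone. The models the article describes go further and relate such effects to individual-level variables.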

The science of predictive modeling has come a long way since 2004. Statisticians now build “personality” models and tie them into other predictor variables. Edgar Welch can now be targeted for messaging not simply on the basis of his demographic voting behavior, but on the basis of a personality classification derived from reams of detailed personal data available for purchase. One such model bears the acronym “OCEAN,” standing for the personality characteristics (and their opposites) of openness, conscientiousness, extroversion, agreeableness, and neuroticism. Using Big Data at the individual level, machine learning methods might classify a person as, for example, “closed, introverted, neurotic, not agreeable, and conscientious.”
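As a rough illustration of how trait scores could turn into a label like the one quoted above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the 0-to-1 scores, the 0.5 threshold, and the names `TRAIT_LABELS` and `describe`. A real model would first have to estimate the scores from personal data, which this sketch does not attempt.

```python
# Each OCEAN trait maps to a (low label, high label) pair.
TRAIT_LABELS = {
    "openness": ("closed", "open"),
    "conscientiousness": ("not conscientious", "conscientious"),
    "extroversion": ("introverted", "extroverted"),
    "agreeableness": ("not agreeable", "agreeable"),
    "neuroticism": ("not neurotic", "neurotic"),
}


def describe(scores, threshold=0.5):
    """Turn per-trait scores (assumed to lie in 0..1) into a list of labels."""
    return [
        high if scores[trait] >= threshold else low
        for trait, (low, high) in TRAIT_LABELS.items()
    ]


# Hypothetical scores for one person.
person = {
    "openness": 0.2,
    "conscientiousness": 0.8,
    "extroversion": 0.3,
    "agreeableness": 0.1,
    "neuroticism": 0.9,
}
print(", ".join(describe(person)))
```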

Alexander Nix, CEO of Cambridge Analytica (owned by Trump’s chief donor, Rebekah Mercer), says he has thousands of data points on you, and every other voter: what you buy or borrow, where you live, what you subscribe to, what you post on social media, etc. At a recent Concordia Summit, using the example of gun rights, Nix described how messages will be crafted to appeal specifically to you, based on your personality profile. Are you highly neurotic and conscientious? Nix suggests the image of a sinister gloved hand reaching through a broken window.

In his presentation, Nix noted that the goal is to induce behavior, not communicate ideas. So where does truth fit in? Johan Ugander, Assistant Professor of Management Science at Stanford, suggests that, for Nix and Cambridge Analytica, it doesn’t. In counseling the hypothetical owner of a private beach how to keep people off his property, Nix eschews the merely factual “Private Beach” sign, advocating instead a lie: “Sharks sighted.” Ugander, in his critique, cautions all data scientists against “building tools for unscrupulous targeting.”

The warning is needed, but may be too late. What Nix described in his presentation involved carefully crafted messages aimed at his target personalities. His messages pulled subtly on various psychological strings to manipulate us, and they obeyed no boundary of truth, but they required humans to create them. The next phase will be the gradual replacement of human “craftsmanship” with machine learning algorithms that can supply targeted voters with a steady stream of content (from whatever source, true or false) designed to elicit desired behavior. Cognizant of the Pandora’s box that data scientists have opened, the scholarly journal Big Data has issued a call for papers for a future issue devoted to “Computational Propaganda.”

Hopefully, it will address broader ethical and policy issues, and not be a “how to” manual.

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Each question comes from an article or a video clip. Please refer to the following articles and video clip for further information:
Q1. How Target Figured Out A Teen Girl Was Pregnant Before Her Father Did
Q2. Facebook reveals news feed experiment to control emotions
Q3. 4 Scary Things About Big Data And You

Q4. (the main article)
Q5. Mark Zuckerberg: I want to share some thoughts on Facebook and the election
Q6. Hyperpartisan Facebook Pages Are Publishing False And Misleading Information At An Alarming Rate

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Questions for discussion:
Session I
1. What’s your opinion on tailored advertising? If you got pregnant and were due in a few months, and then you received mailed coupons for baby clothes and cribs from a new shop, would it make you happy or make you feel your privacy had been violated?
2. In 2014, Facebook conducted an experiment (w/ Cornell and the University of California) in which it manipulated 689,000 users’ news feeds. The experiment results showed that when Facebook reduced users’ exposure to their friends’ “positive emotional content”, users resulted in fewer positive posts of their own; and when reduced exposure to “negative emotional content”, the opposite happened.
What do you think about this experiment? Under what conditions would you accept media outlets, online shopping sites, or social networking sites collecting your data, classifying you, and trying to predict your behavior?
3. There is a saying that goes, “Whenever you type something online, just assume it’s public record.” Is the convenience, or the perceived convenience, of targeted advertising and relevant search results worth the potential loss of your privacy? Would you take any action to prevent your information from becoming public knowledge?

Session II
4. In the article, Alexander Nix said numerous crafted messages may be specifically aimed at voters, based on the analysis and classification of their personality, and the goal is to induce behavior, not communicate ideas. Do you believe it’s possible? If so, will it happen in Taiwan?
5. Some believe that fake news on Facebook influenced the outcome of the 2016 US presidential election. On November 13, 2016, Mark Zuckerberg said he believed “Of all the content on Facebook, more than 99% of what people see is authentic. Only a very small amount is fake news and hoaxes”, and “this makes it extremely unlikely hoaxes changed the outcome of this election in one direction or the other.”
What do you think? How much do you trust online information? How can you tell if a website or an information source is reliable?
6. Please read this statement and share your opinion on it: “The best way to attract and grow an audience for political content on the world’s biggest social network is to eschew factual reporting and instead play to partisan biases using false or misleading information that simply tells people what they want to hear.”
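Chinese version (translated): "The best way to attract an audience for political content is to steer clear of the facts, fill articles with false information carrying partisan bias, and tell people only what they want to hear."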

Agenda:
3:45 ~ 4:00pm Greetings & Free Talk / Ordering Beverage or Meal / Getting Newcomer’s Information
4:00 ~ 4:10pm Opening Remarks / Newcomer’s Self-introduction / Grouping
(Session I)
4:10 ~ 4:50pm Discussion Session (40 mins)
4:50 ~ 5:10pm Summarization (20 mins)
5:10 ~ 5:15pm Regrouping / Giving Instructions / Taking a Short Break (Intermission)
(Session II)
5:15 ~ 5:55pm Discussion Session (40 mins)
6:00 ~ 6:20pm Summarization (20 mins)
6:20 ~ 6:30pm Concluding Remarks / Announcements
************************************************************
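Meeting date: as stated in the post title
Meeting time: please arrive on time at 4:00 pm; the meeting ends around 6:30 pm
Saturday meeting venue: Dante Coffee, Jinan branch (丹堤濟南店)
Address and phone: No. 25, Sec. 3, Jinan Rd., Taipei City; (02) 2740-2350
MRT station: Zhongxiao Xinsheng Station (Bannan Line), Exit 3
Directions: from Exit 3 of Zhongxiao Xinsheng Station, walk along the lane (Lane 10, Sec. 3, Zhongxiao E. Rd.) for about 2 minutes to the Jinan Rd. intersection, turn left, and walk about 2 more minutes until you see the venue.
Minimum charge: NT$80

Notes:
1. Whether to print the article is up to you, but attendees must print the Questions for discussion themselves.
2. Attendees should read the article beforehand and think carefully about all the questions. Thank you for your cooperation!

A note for newcomers:
1. Please prepare a 2-3 minute self-introduction in English in advance; before the meeting ends you may be asked to share your thoughts for 1-2 minutes.
2. Please read the article and the host's discussion questions in advance, and write down in English the opinions you want to share.
3. The whole meeting is conducted in English. Participants should have intermediate English conversation skills and be able to express their views on any discussion question in 5 to 10 English sentences.
4. Before formally joining, you may sit in on up to three meetings; observers must also take part in the discussion. Formal membership requires a lifetime fee of NT$1,000.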
Sherry Liao
Officer-Chief Host

Posts: 1380
Joined: Fri Dec 07, 2007 12:15 pm

Post by Rock » Mon Mar 13, 2017 8:07 pm

This article demands serious study. Need to take some notes first.

1. the acronym “OCEAN,” standing for the personality characteristics (and their opposites) of openness, conscientiousness, extroversion, agreeableness, and neuroticism.
2. cottage industry
3. ....
In matters of style, swim with the current; in matters of principle, stand like a rock.
Rock
President

Posts: 1635
Joined: Wed Oct 31, 2007 9:03 am

Post by Iris Wu » Thu Mar 16, 2017 10:04 am

I'm not sure whether it's the writing style or something else, but I'm afraid the talking points in this article are not easy to comprehend. One possible reason is that many of the examples are not tied closely to his points (the way it was written), and the article does not explain some of the basic concepts of Big Data but jumps straight into its issues using "big" vocabulary (fancy-schmancy words). It is probably a common problem in technical writing, I think, or the author may just be targeting a different group of readers.

Following are my study notes:
• Before: Lie --> big and repeat --> became “facts”/believed
• Now: small lies + machine learning --> effective (e.g. Clinton & child sex)
• Trump:
    o Disregard for the truth: does not look for evidence (like a braggart in a bar)
    o His embrace of casual lying --> It's a product of fake news.
• How and why is fake news produced/created/baked? --> (because of) Good money!!!
    Example: Jestin Coler is a Democrat, and says he wants to expose the phenomenon of fake news. But the money is good (over $100,000 per year), and, in the end, what he wanted people to do was to click on his stories so he could collect advertising revenue.
• Since 2004, political consultants have used big-data models to predict how people will vote, and indicate whether they should be sent messages to encourage them to do so (and if so, which messages). --> predictive modeling (2004) --> Statisticians now build “personality” models and tie them into other predictor variables.

• Big data collection: many personal data points collected:
    o Example: “OCEAN,” standing for the personality characteristics (and their opposites) of openness, conscientiousness, extroversion, agreeableness, and neuroticism.
    o Using Big Data at the individual level, machine learning methods might classify a person as, for example, “closed, introverted, neurotic, not agreeable, and conscientious.”
• Some set the objective (of big data usage): To induce behavior, not communicate ideas --> arouse the interest and action --> click and do something (Example: Beach sign “Shark Sighted”) --> It may not be true, but they don't care. ("they obeyed no boundary of truth.")

Where does truth fit in? It doesn’t. (Nix)
    “a sinister gloved hand reaching through a broken window”
• Nix conclusions:
    o Past: The manipulators crafted messages to target certain groups of people.
    o Future (next phase):
      Machine learning algorithms will replace the human manipulation
      But the algorithms can still be manipulated by “big data usage designers” (= data scientists)
    o We need broader ethical and policy solutions to address possible “Computational Propaganda” issues.

Please feel free to correct me. This is just to encourage more readers to join our meeting discussion on Saturday. :)
Iris Wu
Ex-Vice President
 
Posts: 472
Joined: Tue May 20, 2014 4:33 pm

Post by Sherry Liao » Fri Mar 17, 2017 1:08 pm

Rock wrote: This article demands serious study. Need to take some notes first.

1. the acronym “OCEAN,” standing for the personality characteristics (and their opposites) of openness, conscientiousness, extroversion, agreeableness, and neuroticism.
2. cottage industry
3. ....


I didn't take it that seriously before I read your and Iris's notes :shock:
To me, this article is more like putting a science-fiction plot into the real world... Orz :o :o :o

Post by Sherry Liao » Fri Mar 17, 2017 1:16 pm

Iris, you are marvelous! I learned a lot from your notes. :D :D

Iris Wu wrote: • How and why is fake news produced/created/baked? --> (because of) Good money!!!
    Example: Jestin Coler is a Democrat, and says he wants to expose the phenomenon of fake news. But the money is good (over $100,000 per year), and, in the end, what he wanted people to do was to click on his stories so he could collect advertising revenue.
• Since 2004, political consultants have used big-data models to predict how people will vote, and indicate whether they should be sent messages to encourage them to do so (and if so, which messages). --> predictive modeling (2004) --> Statisticians now build “personality” models and tie them into other predictor variables.



In my opinion, fake news is not necessarily produced/created/baked for money. Jestin Coler's case is probably just one example. I think people have a variety of motives for producing fake news.

Post by Sherry Liao » Fri Mar 17, 2017 1:23 pm

Hi yoyos,

Each question for discussion comes from an article or a video clip. Here are the sources of these articles and the video clip, in case you want to read a bit more:
Q1. How Target Figured Out A Teen Girl Was Pregnant Before Her Father Did
https://www.forbes.com/sites/kashmirhil ... 443dc26668
Q2. Facebook reveals news feed experiment to control emotions
https://www.theguardian.com/technology/ ... news-feeds
Q3. 4 Scary Things About Big Data And You
https://www.youtube.com/watch?v=dQNpj7PxIu0
Q4. (the main article)
Q5. Mark Zuckerberg: I want to share some thoughts on Facebook and the election
https://www.facebook.com/zuck/posts/10103253901916271
Q6. Hyperpartisan Facebook Pages Are Publishing False And Misleading Information At An Alarming Rate
https://www.buzzfeed.com/craigsilverman ... .kbKKXjw1M

The Chinese version of question 6 comes from this source:
https://dq.yam.com/post.php?id=6854

Post by Luis Ko » Fri Mar 17, 2017 1:35 pm

that's what the author wrote.

yeah, i agree with Iris. the author doesn't specify the example to support his points. instead, what he did was trying to denounce Trump's campaign, insinuating that fake news played a big part in Trump's winning the election.. he must be a supporter of Hillary Clinton and, a naysayer of Donald Trump haa~ 8)
i might be a cynic and, a sceptic as well but, i'm definitely not a bad person!!
Luis Ko
YOYO member
 
Posts: 737
Joined: Wed Jun 06, 2007 10:18 pm

Post by Rock » Sat Mar 18, 2017 10:11 pm

We helped Sherry do her homework. No wonder the topic seemed technical. Now everybody who attended the meeting today is a graduate student, if you know what I mean.
Attendee list: Sherry (host), Rock, Katherine, Luis, Danny, Jefferson (newcomer), Christ (long time no see!), Julian, Kat, Sabrina, Jessica, Laura, Light, Iris.... who did I miss? :roll:

Post by Luis Ko » Sun Mar 19, 2017 3:35 pm

"The experiment results showed that when Facebook reduced users’ exposure to their friends’ “positive emotional content”, users resulted in fewer positive posts of their own; and when reduced exposure to “negative emotional content”, the opposite happened."



i still can't quite get this sentence. the last part "the opposite happened"? i thought it should be opposite to reduce or fewer, but it sounded illogical and i was told it was not like that. how come? is the usage good or ambiguous? :ccry:

Post by Rock » Mon Mar 20, 2017 8:07 am

I guess, if he used "the same thing happened", then people would take it as "the fewer positive posts of their own". So he used "the opposite happened" to mean....

Yes, you are right, it's ambiguous, "the opposite" could mean two things: "the more positive" or "the fewer negative". Good job, bro. :lol:

Post by Luis Ko » Mon Mar 20, 2017 1:17 pm

ok, now i got it!! instead of fewer negative posts, it's supposed to mean the opposite of "users resulted in fewer positive posts of their own", which is more positive posts, if not ambiguous.. thank you for your explanation, bro~ :lol:

Post by Rock » Mon Mar 20, 2017 8:03 pm

I guess you're one of the few people who take this ambiguity seriously. :D

Post by Luis Ko » Mon Mar 20, 2017 10:05 pm

that's either because my English sucks or, i'm seriously way too serious la~ XD