Sunday, July 26, 2020

Will computers be allowed to do more than "aping"?

Yesterday I finished reading "What Can Bonobos Teach Us About The Nature of Language" in the July-August 2020 edition of Smithsonian Magazine. Lindsay Stern wrote the article, which is about bonobos and their ability to grasp complex language concepts using lexigrams. The research was pioneered by Dr. Sue Savage-Rumbaugh, who was able to communicate with the bonobos using lexigrams. She would show a banana and point to a specific symbol. Over time, the bonobos learned the symbols for bananas, apples, and other objects and used them to communicate with Sue and other researchers. Dr. Savage-Rumbaugh was eventually able to communicate complex thoughts using the lexigrams. She was later replaced because of the close bond she developed with the bonobos. The current researchers state that Dr. Savage-Rumbaugh deviated from scientific rigor in her research and, in many ways, neglected the apes.

In her Smithsonian article, Stern questions whether humans don't want to develop a relationship with the bonobos because it makes us less human or the apes more human. The thought is intriguing, and it led me to ponder whether humanity will face a similar struggle with machines in the near or distant future. After all, we teach computers using deep-learning algorithms to map images, which can be equated to lexigrams, to objects. Unlike animals, machines are predictable and can be harnessed for superhuman tasks like crunching through petabytes of data in a matter of minutes. If a PC or a Mac says, "I don't want to do the task," then we replace the machinery. If an employee says, "I don't want to do the task," then the employee is disciplined or even fired. Is it a control thing, or is it our biases (acquired or built-in)?


As a trained data scientist, I was taught that it is tough to remove the audience's biases. The best a data scientist can do is to provide the data in a consumable format that will enable the audience to get insights and make decisions.   


Organizations and governments fund research to learn about things that can be leveraged to benefit their respective groups and, eventually, humanity. However, these entities will not fund research that questions the underlying belief systems and makes us less human or makes our research subjects or technology more human. I hope you enjoyed this blog entry. The Smithsonian article made me reflect on the work we do and how we contribute to and transform the "human race."


Tuesday, February 4, 2020

Surprise, Surprise,...blah tech isn't working

Surprise. Surprise. A few days ago, IBM replaced its CEO, Ginni Rometty, with Arvind Krishna. According to Barron's, IBM's sales have fallen by 25% in the last eight years, while other tech giants' sales are doing well. In the previous eight years, I haven't seen IBM make the rounds to sell its products like DB2, Watson, WebSphere, and others.  IBM is not marketing its services well.

These days when I think about IBM, I see it as a "blah" technology company that lacks excitement. I used to think of Microsoft the same way. Things changed after Satya Nadella took the reins at Microsoft. The company made a few exciting moves and acquisitions that tell me Microsoft (specifically the current CEO) has a vision. These acquisitions diversify its offering of services. Here is the timeline:
  1. Microsoft made its flagship product Microsoft Office accessible on non-Windows devices.
  2. Microsoft purchased Minecraft for $2.5 billion. I thought this was a head-scratcher, but my kids love Minecraft. Right off the bat, Microsoft made inroads with the next generation, which will be earning income in the next five to ten years.
  3. Microsoft introduced Windows 10, and it is a solid operating system with a smooth user experience. When I look at Windows 7 and earlier versions, I am reminded of Steve Ballmer's version of Microsoft and his poor image of screaming at developers.  Steve is not missed. He is better off running the Los Angeles Clippers.
  4. Microsoft then bought LinkedIn.
  5. Microsoft purchased GitHub.  As a developer, I thought it was huge because it showed Microsoft's commitment to the developer community and the open-source community.

When I see IBM, I don't see any innovation, and it appears to be the "Big Blue Stale Machine."  Is IBM's Watson ready to collect its Social Security checks? It seems to be aging pretty fast.  The company I am really concerned about is Google. Google recently stopped offering its core product Google Search Appliance, and it bundled it with its email.  Gmail is good, but it doesn't provide the email experience of Outlook.  Google's founders stepped away from leadership and handed the reins to an up-and-coming CEO, Sundar Pichai.  The sudden disappearing act by founding computer scientists Larry Page and Sergey Brin doesn't build my confidence in Google.  Memes like this one below or The Wall Street Journal article will not help Google's image.
For Google to build trust with me, I want to see Google's founders take a more active role in the technical community.  I would also like to see user-centric innovation.  Having the most robust and secure cloud in the market is not appealing to me because Google doesn't market it, and its tools are not user-friendly.  It's time for Larry and Sergey to stop playing in their lab and step up with the big boys like Satya Nadella, Mark Zuckerberg, Reed Hastings, Tim Cook, and others.

Speaking of Tim Cook, the CEO of Apple Inc., he has not done anything earth-shattering, but he has kept Apple up with the times.  The launch of Apple TV is great, but I would like to see more innovation come out of Apple. Shrewd CEOs are strategic, operations-focused, and can be ruthless.  Very few leaders are also innovative.  I see Steve Jobs in this category.  As you may know, the US government is wrestling with Apple to get into its devices via a backdoor.  It will be interesting to see if Apple will ever give in.

Unlike Apple, Mark Zuckerberg and his company Facebook seem to give in to external pressures.  According to Facebook, they are going to let politicians publish political ads.  Facebook is trying to diversify its portfolio of products, but I don't know how they can make much progress since I don't trust them.  It's going to be hard for Facebook to get back in the game until they gain back the public's trust. It can be done.

I am interested to see how Netflix will evolve. Reed Hastings and his team revolutionized streaming digital content to mass audiences. Netflix has created a great market for actors, actresses, producers, directors, and other auxiliary roles for making movies, documentaries, and TV shows. Disney+, Peacock TV, CBS All Access, HBO, and others are trying to catch up to Netflix. According to my graphic, Netflix needs to move into gaming and devices to differentiate itself from other sorts of offerings.

I believe Amazon has reached its potential with e-commerce, and I like the way they are going with Amazon cloud and digital content. Amazon will evolve and begin to move to the left on my image via Amazon Web Services (AWS).

In summary, big tech companies need to offer a diverse set of services, which gives them the flexibility to grow and helps them minimize risk.

Saturday, January 18, 2020

ABCDEF - Astros, Baseball, Cheating, Data, Ethics, and Finances

“…Luhnow is widely considered to be one of the most successful baseball executives of his generation, credited with ushering in the second “analytics” revolution in baseball and rebuilding the Houston Astros into a perennial Postseason contender. But while no one can dispute that Luhnow’s baseball operations department is an industry leader in its analytics, it is very clear to me that the culture of the baseball operations department, manifesting itself in the way its employees are treated, its relations with other Clubs, and its relations with the media and external stakeholders, has been very problematic. At least in my view, the baseball operations department’s insular culture – one that valued and rewarded results over other considerations, combined with a staff of individuals who often lacked direction or sufficient oversight…”
- Excerpt from the Statement of the Commissioner

If you are a baseball fan, then your world got rocked by the penalties that Major League Baseball (MLB) Commissioner Rob Manfred imposed on Houston Astros General Manager (GM) Jeff Luhnow and manager A.J. Hinch.  They are banned from MLB for a year without pay.  Subsequently, Houston Astros owner Jim Crane fired both of them.  Jeff and A.J. were penalized because they oversaw the 2017 Houston Astros baseball team, which cheated in baseball games using technology.  In the few days following the firings, Alex Cora, the Boston Red Sox manager, and Carlos Beltran, the New York Mets manager, stepped down.  Both of them were mentioned in Rob Manfred's Statement of the Commissioner report. Alex was the bench coach of the 2017 Houston Astros, and Carlos Beltran was a player on the 2017 Houston Astros team.

The 2017 Houston Astros used technology to steal the opposing team’s catcher’s signs.  The catcher’s signs tell the pitcher which pitch to throw.  If the batter knows the pitch that he will see, then the batter has a better chance of hitting the ball.  It alters the odds and provides an edge to the batter.


If you don't know much about the game of baseball, here is a video that provides the basics of the game.

While baseball pundits and historians debate whether the penalties on Jeff and A.J. were harsh enough and how badly the scandal hurt the game, the following statements in the commissioner's report caught my attention,
"Luhnow’s baseball operations department is an industry leader in its analytics," and
"the baseball operations department’s insular culture – one that valued and rewarded results over other considerations."  

Ideally, in any data and analytics organization, decisions are based on data and not on emotions and "gut feeling."  The Houston Astros focused on the data and results and forgot their mission, which is to play honest baseball and compete.   MLB players, managers, and teams are tempted to cheat because they can win, which in turn leads to more lucrative financial contracts.

Speaking of money, several years ago, I distinctly remember listening to a Fresh Air episode on National Public Radio (NPR).  In the show, the Fresh Air host interviews Michael Lewis about his book Flash Boys.  The show caught my attention.  The Fresh Air website states the following,
"Flash Boys is about the form of computerized transactions known as high-frequency trading, in which the fastest computers with the highest connection speeds get the information first, and make the trade before anyone else can. A millisecond — even a nanosecond — can make all the difference between how much money is made or lost on any transaction."

With machines making trade decisions in milliseconds, investment banks and hedge funds are exploiting this type of legal cheating using technology.

I believe US law enforcement agencies like the Federal Bureau of Investigation (FBI), US Securities and Exchange Commission (SEC), and others are looking into these types of legal cheating using technology.

Major League Baseball (MLB) punished the cheating team, but I believe the Astros are just the tip of the iceberg.  With financial institutions using machine learning, artificial intelligence, and other technologies to legally cheat, we, as a society, need to develop policies based on data ethics.

I, however, haven't read much about data ethics in any major publications like Harvard Business Review, Medium, Gartner, Forrester, and others.  These publications still promote the objectivity of data-driven decisions since it enables organizations to make ideal decisions.  With high-frequency trading, stealing baseball signs, and other examples, I would like to read about how businesses and organizations are promoting data ethics with their workforce.  The domain of Data Ethics should include data stealing, skewing the data to achieve desirable results, data sharing, speedy data access, and more.

If we don't have penalties for cheating with technology and data, then we may be cheated out of our money.  Companies sharing our personal data for monetary gain is just the beginning.  Folks need to know what is right and wrong when they work with data and analytics.  We cannot afford to have our trusted institutions be like the 2017 Houston Astros.

NOTE: The ABCDEF image is from https://www.123rf.com/photo_10483641_flip-clock-letters-a-b-c-d-e-f.html.

Friday, January 3, 2020

Thank you for the 2010s and looking forward to the 2020s.

Happy New Year!!!  As we look forward to the new decade,  we should reflect on the past decade.  

Content-driven services
A lot happened in the 2010s.  Netflix, Amazon Prime, and ESPN 3 made streaming content "cool."  I went this route because my DVDs were getting scratched, and I was not happy with my cable service.  Overall I saved money, and I was pleased with its flexibility. I could watch content on my handheld devices.

Prediction for the 2020s:  

  • I envision these types of streams will continue to grow, and cable TV will eventually disappear.  Folks will pay for recorded and live content rather than specific TV channels.  I would rather pay the NBA, NFL, MLB, and NHL directly than go through ESPN and FOX Sports.
  • I envision broker companies that will focus on aligning users, content providers, and marketing companies.  The broker companies will develop user profiles based on patterns.  They will use AI and machine learning to build and improve user profiles.  These profiles will then be leveraged to provide targeted ads.  Amazon is already taking steps to go this route via Amazon Channels[1].

Technical Debt

Currently, Chief Information Officers (CIO) are struggling with aging infrastructure and the lack of funds to upgrade the infrastructure, which includes network infrastructure, databases, applications, and other IT-related components.  Hackernoon.com does a great job of defining Technical Debt[2], and Technical Debt is one of the significant causes of cybersecurity vulnerabilities.

Prediction for the 2020s:  
  • Companies will be forced to address this problem in the 2020s because of security vulnerabilities that will be exposed in the old infrastructure.
  • I envision start-up companies that will specialize in upgrading infrastructure, which includes moving digital assets (e.g., applications, data, and others) to the cloud. These companies will bring processes like DevSecOps[3] and the associated tools to the project.
  • These companies will need to provide a holistic approach in assessing the issues, developing a strategy, and executing the plan.

Leverage platform services

In the 2010s, companies recognized that they needed more enterprise IT services. Email isn't the only enterprise service. Services like Identity and Access Management (IAM), enterprise search, document management, and others need to be considered as well. Companies bought the infrastructure and hired the staff to develop each service and maintain it.  Unfortunately, a document management SME doesn't have the bandwidth to understand the rapidly evolving cybersecurity attacks.  This leaves businesses behind the eight-ball when it comes to sustaining an enterprise service and being vigilant about cyber threats.

Prediction for the 2020s:  
  • Large vendors like Google, Facebook, Microsoft, and others will offer cloud-based enterprise services.  These services will have simple and clean interfaces.  By having a clean and straightforward interface,  the potential targets for a cyber attack are reduced. The current set of examples of cloud-based enterprise services are the Google Identity Platform [4], Tableau Online [5], and others. 
  • I envision IT groups focusing on training their staff on how to integrate their business applications with their enterprise services.

CXO roles

Currently, the Chief Information Officer (CIO) is the chief IT officer in any business organization.  Typically the Chief Information Security Officer (CISO) and the Chief Data Officer (CDO) report to the CIO.  I envision these roles will change in the future.

Prediction for the 2020s:
  • Due to the continuous threat of cyber attacks, the role of the CISO will be elevated in the business organization.  The CISO will be equivalent to the CIO.  
  • The job of the CDO will continue to be a tough one[6].  The CDO role will not be funded well and may not have a lot of authority.  The CDO will need to please two masters, the CIO and CISO.
  • The CDO role will follow a similar pattern to the Gartner Hype Cycle [7], where the hype will go down because the CDO role is not an operations role.  The CDO will be on the same level as the Chief Technology Officer (CTO) or lower.
  • The Cloud Architect will be elevated to the C-Suite due to the complexity of the cloud and the cybersecurity threats. On December 30th, 2019, The Wall Street Journal published an article, Ghosts in the Clouds: Inside China’s Major Corporate Hack[8], which discusses how Chinese hackers went in through the cloud providers.

[1] Casey, H., and Reisinger, C. (2019, May 13). What Is Amazon Channels and Is it Worth It? Tom's Guide. Retrieved from https://www.tomsguide.com/us/amazon-channels-faq,review-4125.html.
[2] Hackernoon.com. (2018, Jan. 25). There are 3 main types of technical debt. Here’s how to manage them. Retrieved from https://hackernoon.com/there-are-3-main-types-of-technical-debt-heres-how-to-manage-them-4a3328a4c50c.
[3] RedHat, Inc., (n.d.). What is DevSecOps?  Retrieved from https://www.redhat.com/en/topics/devops/what-is-devsecops.
[4] Google, LLC (n.d.). Google Identity Platform. Retrieved from https://developers.google.com/identity
[5] Tableau, LLC (n.d.). Tableau Online. Retrieved from https://www.tableau.com/products/cloud-bi.
[6] Bennett, Jo. (2016, Apr. 11). How chief data officers can tackle formidable roadblocks, including people, culture, and internal resistance. Smarter With Gartner. Gartner, Inc.  Retrieved from https://www.gartner.com/smarterwithgartner/half-of-cdos-succeed/.
[7] Gartner, Inc. (n.d.). Gartner Hype Cycle.  Retrieved from https://www.gartner.com/en/research/methodologies/gartner-hype-cycle.
[8] The Wall Street Journal (2019, Dec. 30). Ghosts in the Clouds: Inside China’s Major Corporate Hack. Retrieved from https://www.wsj.com/articles/ghosts-in-the-clouds-inside-chinas-major-corporate-hack-11577729061.

Friday, December 27, 2019

Data generated by its citizens as a national asset...

The following statement in a Harvard Business Review (HBR) article, which was published on December 18, 2019, caught my eye,
"India is a nation state, it would treat the data generated by its citizens as a national asset, store and guard it within national boundaries, and reserve the right to use that data to safeguard its defense and strategic interests."[1]  The HBR article is about India's proposed legislation to protect its consumer data. The bill is called the Personal Data Protection Bill (DBP).

On December 24, 2019, The New York Times published an article called Pentagon Warns Military Personnel Against At-Home DNA Tests.[2] The article discusses a Department of Defense internal memo which discourages military personnel from taking mail-in DNA tests.  According to DoD leadership, the tests:
  • Are unreliable;
  • Negatively impact service members' careers; and
  • Create security risks.

I am not sure if there is any relationship between the two articles, but together they highlight the dangers of international privacy laws, which are still evolving.  I believe our DNA information is our most private and personal information.  If countries like India push laws like the DBP, which can use their citizens' personal information, including DNA, for national and strategic interests, then folks need to be careful about their personal data.  Nations cannot use their citizens' data if the data isn't generated.

The next question to ask is whether these laws will get less complicated.  The answer lies in the history of privacy laws.

According to the HBR article's authors, the proposed DBP is based on the European Union's (EU) General Data Protection Regulation (GDPR).  According to an EU website, the GDPR was released on May 24, 2016, and the EU enforced the GDPR on May 25, 2018.[3] The EU used the US Department of Commerce's National Institute of Standards and Technology (NIST) recommendation called Guide to Protecting the Confidentiality of Personally Identifiable Information (PII) (Special Publication 800-122)[4], which was released in April 2010.  The state of California issued the California Consumer Privacy Act (CCPA) in June 2018.  The CCPA is based on GDPR with modifications.[5]

With each data regulation, bill, and recommendation released over time, nations and states are trying to protect their citizens and their citizens' data, but countries (like India, China, and others) may use the data to further political and national security interests.

As members of the digital age, we cannot cut ourselves off from the internet, but we need to be aware of the risks, policies, and rights since they are constantly evolving like the technologies around us.  Militaries and governments may ask their citizens to share their DNA for the good of their countries. Still, I am not a fan of businesses monetizing, or criminal syndicates stealing, my data. I believe privacy laws, in general, are good, but we in the US need a single cohesive piece of data protection legislation and a watchdog organization that focuses on digital privacy.  I am not a policy wonk, but I believe national and international privacy laws need to mature as data grows exponentially and technologies like artificial intelligence proliferate in our lives over the next ten years.

Until then, please be careful what you share on the web, including your DNA results.  Currently, our DNA information in the US is protected via the Genetic Information Nondiscrimination Act of 2008 (GINA) [6]. That being said, presidents change, legislators change, Supreme Court justices change, governments may change, but your DNA information doesn't change.

[1] Govindarajan, V., Srivastava, A., & Enache, L. (2019, Dec. 18). How India Plans to Protect Consumer Data. Retrieved from https://hbr.org/2019/12/how-india-plans-to-protect-consumer-data.
[2] Murphy, H., & Zaveri, M. (2019, Dec. 24). Pentagon Warns Military Personnel Against At-Home DNA Tests. The New York Times. Retrieved from https://www.nytimes.com/2019/12/24/us/military-dna-tests.html.
[3] Data protection in the EU | European Commission.  Retrieved from https://ec.europa.eu/info/law/law-topic/data-protection/data-protection-eu_en.
[4] McCallister, E., Grance, T., & Scarfone, K. (2010, Apr.). Guide to Protecting the Confidentiality of Personally Identifiable Information (PII). NIST Special Publication 800-122. Retrieved from https://csrc.nist.gov/publications/detail/sp/800-122/final.
[5] Korolov, M. (2019, Oct. 4). California Consumer Privacy Act (CCPA): What you need to know to be compliant. Retrieved from https://www.csoonline.com/article/3292578/california-consumer-privacy-act-what-you-need-to-know-to-be-compliant.html.
[6] The Genetic Information Nondiscrimination Act of 2008 (2008, May 21). U.S. Equal Employment Opportunity Commission (EEOC). Retrieved from https://www.eeoc.gov/laws/statutes/gina.cfm.
[7] Picture for this blog post retrieved from https://pixabay.com/illustrations/dna-matrix-genetics-control-3888228/.

Friday, December 20, 2019

"Dear Santa" letter from a vigilant kid

Dear Santa,

I cannot send you my Christmas letter because the letter will have a lot of my personal data aka personally identifiable information (PII) data. Due to privacy laws and potential stalkers, my parents forbade me to send you a letter or an email. With the various cybersecurity issues, I am not sure if my email will end up in the wrong hands and my daddy, mommy, brother, sister and I may get socially engineered and phished.

Since I still believe in you, here is a compromise. I will share my Amazon wish list and you can use it to send me toys. It needs to be done magically because my parents forbade me to share my address with you. I noticed that now Amazon recommends items that I can add to my Amazon wish list. I wonder if Amazon is learning my behavior on Amazon.com. Doesn't that compromise my privacy? Mommy and Daddy tell me privacy laws are here to protect me but I am not sure if websites like Amazon.com, Google.com, Yahoo.com should use my data to make money.

Nevertheless, please send me your email address and I will share my Amazon wish list.

Merry Christmas Santa,

Your Friend

P.S.  My parents forbade me to share my name with you.

NOTE: I used the picture from https://spaceshipsandlaserbeams.com/20-free-printable-letters-to-santa/.

Sunday, December 15, 2019

Hack the LIME - the story about open-source AI

Leave it to a lawyer to spoil the fun!  Andrew Burt, a lawyer and writer, recently wrote an article for the Harvard Business Review, which discusses the risks of open-source Artificial Intelligence (AI) models.  The HBR post, "The AI Transparency Paradox," explains the dangers of making artificial intelligence models transparent.  Businesses want to know how companies engineered their AI algorithms.  Is the AI algorithm sexist, racist, or just not good?  For instance, in James Vincent's article, Google ‘fixed’ its racist algorithm by removing gorillas from its image-labeling tech, Google "fixed" its 2015 deep learning algorithm, which labeled humans as gorillas. I am sure the appropriate stakeholders at Google wanted to know how the algorithm was written and how the developers could fix the inappropriate bias in the algorithm.

On the one hand, if Google opened its AI code to the open-source community, then the AI algorithms could be significantly improved.  On the other hand, our no-fun lawyer friend Andrew Burt cautions businesses that they would be vulnerable if they opened their AI algorithms to the world, which includes bad actors like hostile governments.  Andrew cites the research paper titled “Why Should I Trust You?” Explaining the Predictions of Any Classifier.  The paper is about the Local Interpretable Model-Agnostic Explanations (LIME) algorithm.  Andrew then talks about a second research paper, How can we fool LIME and SHAP? Adversarial Attacks on Post hoc Explanation Methods, which was published in November 2019.  That paper discusses a "novel scaffolding technique," which can be used to hack the LIME algorithm.  Is there a return on investment if companies make their algorithms more transparent?

Folks may argue that Google successfully made its TensorFlow machine learning (ML) platform open source without much impact.  Before the TensorFlow platform question can be answered,  the following terms need to be defined:
  • Machine Learning;
  • Deep Learning; and 
  • Artificial Intelligence.
According to the GeeksforGeeks portal, "Machine Learning is a technique of parsing data, learn from that data, and then apply what they have learned to make an informed decision." Machine Learning includes supervised training, where the machine builds the model on labeled data. Typically the model is trained on most of the data (70% to 90% of the complete dataset) and then tested for accuracy with the remaining portion of the data.  Unsupervised training involves the machine learning patterns from unlabeled data.  Since there are no labeled answers, there is no accuracy test against a held-out portion in the same sense.
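
To make the train-then-test idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy dataset, the 80% training share, and the very simple "threshold" model are stand-ins, not anything taken from the sources quoted in this post.

```python
import random


def train_test_split(rows, train_fraction=0.8, seed=0):
    """Shuffle a copy of the rows, then cut it into a training and a test portion."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_threshold(train):
    """'Train' a toy model: a cut-off halfway between the two class averages."""
    zeros = [x for x, label in train if label == 0]
    ones = [x for x, label in train if label == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2


def accuracy(threshold, rows):
    """Share of rows where 'feature above threshold' agrees with the label."""
    correct = sum(1 for x, label in rows if (x > threshold) == (label == 1))
    return correct / len(rows)


# Hypothetical toy dataset of (feature, label) pairs: two well-separated groups.
rng = random.Random(42)
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(100)]
data += [(rng.gauss(10.0, 1.0), 1) for _ in range(100)]

train, test = train_test_split(data, train_fraction=0.8)
threshold = fit_threshold(train)        # the model only ever sees the training rows
test_accuracy = accuracy(threshold, test)  # accuracy is measured on unseen rows

print(len(train), len(test))
print(test_accuracy)
```

The point of the split is that the accuracy number comes from rows the model never saw during training, which is a fairer estimate of how it would behave on new data.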

Deep Learning algorithms are a specific subset of Machine Learning algorithms, which are primarily composed of neural networks. A common use of deep learning algorithms is image recognition.

Gartner defines Artificial Intelligence (AI) as the application of "...advanced analysis and logic-based techniques, including machine learning, to interpret events, support and automate decisions, and take actions."

Gartner recommends several open-source Data Science and ML platforms, including TensorFlow, to develop ML-based solutions.    Does this mean Mr. Burt is incorrect?

Andrew Burt is correct because algorithms are the core of all AI solutions, and compromised algorithms will cause businesses to make poor decisions and ultimately discredit the AI solution. The underlying technology, which invokes the algorithm, pulls the data, processes the data, and reports the outcomes, can be open source.  The other critical piece of any AI solution is the data itself.  Sensitive data is never shared unless requested by a law enforcement or regulatory agency.

In conclusion, Andrew Burt brings up valid points, but at the same time, the AI field is still pretty young.  I am a strong believer in open-source software, and I believe we will have open-source AI models available.  We will also have AI models that validate other models using technologies like blockchain.

NOTE: The video below is about the LIME algorithm, and please keep in mind researchers recently hacked this algorithm.
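
For readers who prefer code to video, here is a rough sketch of the idea behind LIME as I understand it: perturb the input around one point, ask the opaque model for predictions, weight the samples by how close they are to the point, and fit a simple linear model to them. This is a simplified illustration in plain Python, not the LIME library or the authors' implementation. The `black_box` function, the sampling spread, and the kernel width are all made-up assumptions.

```python
import math
import random


def black_box(x):
    """Hypothetical opaque model: nonlinear in x[0], linear in x[1]."""
    return x[0] ** 2 + 3.0 * x[1]


def solve(a, b):
    """Solve the small linear system a * w = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - tail) / m[i][i]
    return w


def explain_locally(model, point, n_samples=500, spread=0.5, kernel_width=0.75, seed=0):
    """Fit a proximity-weighted linear surrogate around `point`.

    Returns (intercept, coefficients); each coefficient approximates how much
    the model output changes per unit change of that feature near `point`.
    """
    rng = random.Random(seed)
    dim = len(point)
    # Accumulate the weighted normal equations (X^T W X) w = X^T W y.
    xtwx = [[0.0] * (dim + 1) for _ in range(dim + 1)]
    xtwy = [0.0] * (dim + 1)
    for _ in range(n_samples):
        delta = [rng.gauss(0.0, spread) for _ in range(dim)]   # random perturbation
        sample = [p + d for p, d in zip(point, delta)]
        dist2 = sum(d * d for d in delta)
        weight = math.exp(-dist2 / kernel_width ** 2)           # closer samples count more
        row = [1.0] + delta                                     # intercept + centered features
        y = model(sample)
        for i in range(dim + 1):
            xtwy[i] += weight * row[i] * y
            for j in range(dim + 1):
                xtwx[i][j] += weight * row[i] * row[j]
    intercept, *coefs = solve(xtwx, xtwy)
    return intercept, coefs


intercept, coefs = explain_locally(black_box, [1.0, 0.0])
print(coefs)
```

Near the point (1, 0), the surrogate's coefficients should land close to the local slopes of the toy model (about 2 for the first feature and about 3 for the second). The attack paper mentioned above exploits the fact that explanations like this depend on how the model behaves on the perturbed samples, which may not look like real data.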