So the next thing to consider related to teams is the broader set of considerations around regulatory governance and data stewardship. At the very least, there should be a plan for medical data stewardship that everyone involved in the project can agree to. >> It's critical that all members of the team be trained in, and follow, strict best practices when working with healthcare data, even if it is de-identified, since a breach or leakage of the data can be catastrophic to the project. It's common for there to be a culture shock when experienced data scientists from other domains join a healthcare-related project for the first time. These issues can be exacerbated when healthcare machine learning projects are performed or led by consumer technology companies, whose data sharing practices are often not subject to the same privacy-protecting regulations and codes of conduct as those of medical researchers. >> While it may seem obvious to anyone in healthcare research that medical data is very different from other types of data, like stock market data, real estate prices, or banking data, it's often not a foregone conclusion for many outside healthcare research that medical data privacy has very personal implications, beyond even personal banking and financial data. After all, medical data, if used improperly, could damage an individual's psychological well-being as well as their employability and insurability, and have life-altering consequences. Examples may include the results of an HIV test or genetic susceptibility testing. >> So proper training up front is recommended for those who are new to working with healthcare data. These data represent information that may be regarded as sensitive or consequential, disclosed only in the context of intimate personal relationships or professional doctor-patient relationships.
Usage of the data outside of the practice of caring for a patient, for example in a machine learning project, challenges these expectations. Training up front will help ensure that everyone involved in the project has had at least basic medical research and data stewardship education, and can help avoid the context transgressions that can occur in collaborations or partnerships where data may flow between medical, social, and commercial contexts governed by different privacy norms. >> Of course, medical data is a big business in some systems, notably the US, and patients' electronic health records, de-identified and aggregated with other patient data, are commonly bought, sold, and shared at will, with little to no notice and with questionable consent from patients. This is largely because de-identified data are not considered personal health information, despite studies showing that it is fairly easy to re-identify an individual from a given data set. The consent process often takes place at inappropriate times in healthcare, when a patient has little ability to opt out of a data sharing agreement. It can be one line embedded in a huge hospital admissions form stating that the hospital owns the rights, in perpetuity, to all data generated about a patient from care, tissue collection, and so on. >> When curating large medical data sets for clinical machine learning applications, it's important to ensure that the data will not be used in a way that could cause harm or is unethical, whether intentionally or unintentionally, such as denying care to individuals without insurance or reducing therapeutic options based on a model prediction. Members of the development team and ecosystem must share a common understanding of the ethical and regulatory issues around the development and use of clinical machine learning tools. >> And let's also mention the overall team composition.
When selecting members of a team, it's critically important to consider how the project will promote equitable clinical machine learning solutions that do not suffer from bias that either leaves out, or worse, harms certain societal populations. Medicine in general has a poor track record for inclusion in clinical studies, and many common recommendations in healthcare are based on small, homogeneous populations that tend to skew toward Caucasian males. >> Working to ensure that new clinical machine learning applications are free from bias can start with the composition of the team itself: forming a team that is diverse with respect to gender, culture, race, age, ability, ethnicity, sexual orientation, socioeconomic status, privilege, and so on. This is a challenge for many machine learning teams because there are few industries less diverse than computer science and engineering. It's been repeatedly documented that the lack of diversity in the machine learning field has led to flawed systems that exacerbate gender and racial biases in the resulting machine learning models. A popular example is the very first version of Apple HealthKit, which enabled remarkably specialized health tracking features for users, including everything from laboratory results on down, but did not include menstrual cycle tracking. This has been attributed to the fact that the original development team for the project did not include any women at all. It has since been remedied, and menstrual cycle tracking was later added, but it's a useful reminder of the importance of diversity on machine learning teams. >> The development of healthcare AI tools requires a diverse team including information technologists, data scientists, ethicists, lawyers, clinicians, patients, and clinical teams, as well as organizations that collaborate and prioritize governance structures and processes.
These teams will need a macro understanding of the data flows, transformations, incentives, levers, and frameworks for algorithm development and validation, as well as knowledge of the ongoing changes required post-implementation. The likely substantial infrastructure costs for developing and deploying machine learning solutions could leave smaller, resource-constrained healthcare systems, and the patients they serve, at a disadvantage. Machine learning tools that succeed can achieve significant cost reductions or improve patient outcomes, creating further competitive advantage that can exacerbate existing healthcare disparities. >> Best practices should be driven by an implementation science research agenda and should engage stakeholders in an effort to reduce the cost and complexity of AI technologies. This is particularly important for smaller healthcare systems, many of which are in rural and resource-constrained environments. In summary, the best clinical machine learning outcomes will come from team-based approaches composed of individuals with complementary skill sets, essential expertise, and diversity of backgrounds. >> So get your team together and see what can happen.