{"ID":78054,"post_author":"9208550","post_date":"2018-12-14 13:57:24","post_date_gmt":"0000-00-00 00:00:00","post_content":"","post_title":"LIMSjournal - Spring 2018","post_excerpt":"","post_status":"draft","comment_status":"closed","ping_status":"closed","post_password":"","post_name":"","to_ping":"","pinged":"","post_modified":"2018-12-14 13:57:24","post_modified_gmt":"2018-12-14 18:57:24","post_content_filtered":"","post_parent":0,"guid":"https:\/\/www.limsforum.com\/?post_type=ebook&p=78054","menu_order":0,"post_type":"ebook","post_mime_type":"","comment_count":"0","filter":"","_ebook_metadata":{"enabled":"on","private":"0","guid":"E3034011-EB82-405D-82ED-8EF4117D5E78","title":"LIMSjournal - Spring 2018","subtitle":"Volume 4, Issue 1","cover_theme":"nico_7","cover_image":"https:\/\/www.limsforum.com\/wp-content\/plugins\/rdp-ebook-builder\/pl\/cover.php?cover_style=nico_7&subtitle=Volume+4%2C+Issue+1&editor=Shawn+Douglas&title=LIMSjournal+-+Spring+2018&title_image=https%3A%2F%2Fs3.limsforum.com%2Fwww.limsforum.com%2Fwp-content%2Fuploads%2FFig3_Scotti_Molecules2018_23-1.png&publisher=LabLynx+Press","editor":"Shawn Douglas","publisher":"LabLynx Press","author_id":"26","image_url":"","items":{"b94dc07071fd3149fbecd75f93d73558_type":"article","b94dc07071fd3149fbecd75f93d73558_title":"Big data and public health systems: Issues and opportunities (Rojas de la Escalera and Carnicero Gim\u00e9nez de Azc\u00e1rate 2018)","b94dc07071fd3149fbecd75f93d73558_url":"https:\/\/www.limswiki.org\/index.php\/Journal:Big_data_and_public_health_systems:_Issues_and_opportunities","b94dc07071fd3149fbecd75f93d73558_plaintext":"\n\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\n\t\t\t\tJournal:Big data and public health systems: Issues and opportunities\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tFrom LIMSWiki\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\tJump to: navigation, search\n\n\t\t\t\t\t\n\t\t\t\t\tFull article title\n \nBig data and public health systems: Issues and opportunitiesJournal\n 
\nInternational Journal of Interactive Multimedia and Artificial Intelligence\nAuthor(s)\n \nRojas de la Escalera, David; Carnicero Gim\u00e9nez de Azc\u00e1rate, Javier\nAuthor affiliation(s)\n \nSistemas Avanzados de Tecnolog\u00eda, Health Service of Navarre\nPrimary contact\n \nEmail: javier dot carnicero dot gimenez at cfnavarra dot es\nYear published\n \n2018\nVolume and issue\n \n4 (7)\nPage(s)\n \n53\u201359\nDOI\n \n10.9781\/ijimai.2017.03.008\nISSN\n \n1989-1660\nDistribution license\n \nCreative Commons Attribution 3.0 Unported\nWebsite\n \nhttp:\/\/www.ijimai.org\/journal\/node\/1629\nDownload\n \nhttp:\/\/www.ijimai.org\/journal\/sites\/default\/files\/files\/2017\/03\/ijimai_4_7_8_pdf_16011.pdf (PDF)\n\nContents\n\n1 Abstract \n2 Introduction \n\n2.1 The health system \n2.2 The health cluster or ecosystem \n2.3 Challenges faced by the health system \n2.4 The transformation of public health systems \n\n\n3 Big data solutions in a healthcare environment \n\n3.1 The health information system \n3.2 Potential contributions of big data to health systems \n3.3 Requirements for the use of big data within the health information system \n3.4 Some additional issues and barriers \n\n\n4 Big data in SERGAS: A case study \n5 Conclusions \n6 Acknowledgements \n7 References \n8 Notes \n\n\n\nAbstract \nIn recent years, the need to change the current model of European public health systems in order to ensure their sustainability has been raised repeatedly. In this context, information technology (IT) has consistently been cited as one of the key instruments for enhancing the information management processes of healthcare organizations, thus contributing to the improvement and evolution of health systems. In the IT field, big data solutions are expected to play a major role, since they are designed for handling huge amounts of information in a fast and efficient way, allowing users to make important decisions quickly. 
This article reviews the main features of the European public health system model and the corresponding healthcare and management-related information systems, the challenges that these health systems are currently facing, and the possible contributions of big data solutions to this field. To that end, the authors share their professional experience on the Spanish public health system and review the existing literature related to this topic.\nKeywords: big data, health system, healthcare organizations, health information systems, epidemiological surveillance, strategic planning\n\nIntroduction \nThe health system \nAccording to the World Health Organization (WHO), \u201ca health system consists of all organizations, people and actions whose primary intent is to promote, restore or maintain health. This includes efforts to influence determinants of health as well as more direct health-improving activities. A health system is therefore more than the pyramid of publicly owned facilities that deliver personal health services.\u201d[1] Furthermore, every health system performs the following set of basic functions[2]:\n\n delivering health services to individuals and to populations\n creating resources\n providing stewardship\n financing the system\nThe center of any health system must be the first of these functions, since healthcare constitutes the paramount goal and therefore the reason for the existence of the health system itself. Around it, other functions are organized, essential for ensuring healthcare delivery and public health. 
Among these, the following stand out:\n\n epidemiological surveillance, which comprises the collection and analysis of large volumes of data directly or indirectly related to people\u2019s health, so as to detect or prevent possible public health problems\n planning and overseeing the management of the health system, which allows healthcare organizations to set out their strategic goals, allocate the necessary resources, assess the degree to which these goals are met, and apply corrective measures if required\n clinical research, focused on generating knowledge and applying it to the development of new diagnostic and therapeutic techniques\n education and teaching, in order to train new professionals and keep practicing ones up to date and competent\nThe health cluster or ecosystem \nFrom a structural point of view, a health system is neither an isolated nor a homogeneous entity; rather, it comprises or relates to entities of diverse nature, both public and private, with interests of their own as well as shared interests. 
This ensemble is known as the health cluster or ecosystem, and its components include the following[3]:\n\n central or federal government and regional or local authorities \n healthcare services, conceived as organizations responsible for the management of a particular healthcare network \n hospitals\n primary care centers\n emergency services\n pharmacies\n convalescent centers\n health professionals acting as external providers to the health system\n public health services\n insurance companies, mutual societies, and other entities which finance healthcare\n schools for the education and training of doctors, nurses, and other health professionals\n research centers\n professional associations and colleges\n foundations and learned societies\n stakeholders, such as patient associations \n pharmaceutical and other health technology industries\nChallenges faced by the health system \nFor decades, the public health systems of European countries, created following the end of World War II, have frequently been cited as a reference model to follow, especially with regard to coverage, quality of service, and contribution to the welfare of society. 
However, the setting in which these systems arose has undergone a series of major changes, the most important being the following[4]:\n\n the aging of the population, with a continuous increase in chronic and degenerative diseases\n the financial crisis, which has caused significant cuts in the public funds that finance health system activities and makes it more difficult, or even impossible, for citizens to compensate for these cuts with out-of-pocket expenses\n the creation of new techniques and drugs, more effective but also more expensive, mainly due to the need to recoup the research costs of their development\n the increasing demands of citizens, who require more and better healthcare services in a setting that seeks patient empowerment and the promotion of personalized medicine\nAs an illustration of the first two determinants, aging of the population and public budget cuts, the Spanish case is addressed below. Table 1 shows the progress of these two indicators during the period between 2003 and 2014.\n\n\n\n\n\n\n\nTable 1. 
Demographics and health expenditure in Spain (2003\u20132014)\nSources - demographics: Demographic Information System, Spanish National Statistics Institute (INE); health expenditure data: OECD Health Statistics\n\nYear\tTotal population\tPopulation 15\u201364 years\tPopulation older than 64 years (people)\tPopulation older than 64 years (% over total)\tDependency ratio\tPublic health expenditure (M\u20ac)\tPublic health expenditure (% GDP)\tPrivate health expenditure (M\u20ac)\tPrivate health expenditure (% GDP)\n2003\t42,717,064\t29,396,965\t7,276,620\t17.03%\t24.75%\t43,158.4\t5.37%\t17,354.5\t2.16%\n2004\t43,197,684\t29,777,965\t7,301,009\t16.90%\t24.52%\t46,992.4\t5.46%\t18,651.1\t2.17%\n2005\t44,108,530\t30,511,110\t7,332,267\t16.62%\t24.03%\t51,351.5\t5.52%\t20,094.2\t2.16%\n2006\t44,708,964\t30,849,177\t7,484,392\t16.74%\t24.26%\t56,662.2\t5.62%\t21,520.7\t2.14%\n2007\t45,200,737\t31,188,079\t7,531,826\t16.66%\t24.15%\t61,612.0\t5.70%\t23,101.9\t2.14%\n2008\t46,157,822\t31,869,008\t7,632,925\t16.54%\t23.95%\t68,147.1\t6.11%\t24,392.9\t2.19%\n2009\t46,745,807\t32,145,023\t7,782,904\t16.65%\t24.21%\t73,035.6\t6.77%\t23,863.0\t2.21%\n2010\t47,021,031\t32,153,527\t7,931,164\t16.87%\t24.67%\t72,852.6\t6.74%\t24,593.7\t2.28%\n2011\t47,190,493\t32,082,758\t8,093,557\t17.15%\t25.23%\t71,800.0\t6.68%\t25,510.2\t2.38%\n2012\t47,265,321\t31,980,402\t8,222,196\t17.40%\t25.71%\t68,262.9\t6.47%\t26,594.3\t2.55%\n2013\t47,129,783\t31,718,285\t8,335,861\t17.69%\t26.28%\t65,718.5\t6.26%\t26,981.3\t2.62%\n2014\t46,771,341\t31,281,943\t8,442,427\t18.05%\t26.99%\t65,975.7\t6.34%\t28,558.1\t2.74%\n\nThese data reveal that the Spanish population has increased from 42.72 to 46.77 million people during the 2003-2014 period, while the percentage of people older than 64 years has 
risen from 17.03% to 18.05% of the total population, and the dependency ratio, which indicates the ratio between the population older than 64 years and the population between 15 and 64 years old, has risen from 24.75% to 26.99%. Meanwhile, public health expenditure amounted to 5.37% of GDP in 2003, peaked at 6.77% in 2009, and fell to 6.26% in 2013 before a small recovery to 6.34% of GDP in 2014. Private health expenditure remained between 2.14% and 2.17% of GDP until 2007 but has risen year after year since the beginning of the financial crisis in 2008, reaching 2.74% of GDP in 2014. \nAll things considered, the impact of these determinants is so significant that the sustainability of this model of public health system has been questioned in recent years.\n\nThe transformation of public health systems \nDespite the fact that the challenges explained above make clear that a deep transformation of this health system model is needed, and IT is often considered one of the main facilitators of this change, it would be wrong to think that health systems are going to lose their essential features. Health systems must improve people\u2019s health, from both an individual and a collective point of view, and this final goal will not change in spite of the introduction of new technologies such as big data. \nThe patient must always be the center of any health system and, in the same way, health information must always be the center of a health information system, which will be introduced below. The actions of a clinical professional focus on the achievement of specific healthcare goals customized for each of their patients, improving or maintaining their health status. Besides knowledge, healthcare requires a connected and personalized relationship between the provider and the patient, so that interventions are tailored to the patient\u2019s unique preferences and behavior, as with, for instance, drug adherence. 
Different people will have different reasons for non-adherence.[5]\nFor their part, health system managers must work toward the general goals defined by their organizations. These goals are the aggregate of the individual goals of each professional on their clinical staff. In addition, these managers will also be responsible for the allocation of the necessary resources and the financing of the whole activity in their organizations.\nOn the whole, health systems must focus their efforts on the creation of value for both the patient and society. To that end, clear goals must be defined that find an appropriate balance between the patient\u2019s personal interests and the collective interest of society. For instance, in the event of a surgical intervention it is mandatory to measure indicators such as mortality rate, adverse events, time of recovery, care costs, or time for patients to return to their jobs at full capacity. Nevertheless, it is also necessary to take into account other indicators, perhaps more subjective and thus harder to measure, but equally important because of their impact on the patient, such as post-surgery functionality, pain suffered, or the cost of all these factors from a quality-of-life point of view. If health systems are not focused on their patients\u2019 interests and on achieving the corresponding goals, they will hardly be able to change and ensure their sustainability.[6]\nThis article reviews the main features of the European public health system model and the corresponding healthcare and management-related information systems, the problems and challenges that these health systems are currently facing, and the solutions that big data tools may potentially offer in that respect. 
To that end, the authors have based this work on their professional experience with the Spanish public health system, an analysis of the scene that the latter is facing in the upcoming years and decades, and a review of the existing literature on big data applied to health.\n\nBig data solutions in a healthcare environment \nThe health information system \nThe individual performance of the different components of the health cluster, as well as their interactions, causes the creation of multiple data flows, which are also greatly varied, since they involve several business processes. This set of data flows creates the health information system as a result. \nAs mentioned above, just as healthcare is the center of any health system, patient-related healthcare information must be the center of the health information system as well, since it may and will also be used for activities other than healthcare, such as epidemiological surveillance, planning, overseeing of management, clinical research, and education and training, as stated in the introduction. The fact that these data are stored in healthcare information systems is a consequence of them being generated during the patient\u2019s care, but their usefulness goes clearly beyond this limit. \nTherefore, the health information system must allow users to register, process, consult and share large amounts of data, ensuring their availability at the appropriate moment and point of the health cluster. On the healthcare side, this cluster reveals itself as a huge generator and at the same time consumer of enormous sets of information, related to personalized healthcare processes that take place on a daily basis and in a massive way. 
Healthcare is considerably intense regarding data treatment, by constantly creating immense datasets and frequently requiring access to knowledge sources.\nFor IT to be fully integrated in the health system value chain, it is mandatory to have a health information system which serves as an instrument for knowledge management being useful to all its users. Healthcare professionals cannot perform their duties properly without registering and using patients\u2019 information, or without accessing the knowledge sources that allow them to make decisions on a solid basis. Public health departments need to know the population health status in order to detect or prevent potential collective health issues, as well as defining the necessary corrective and preventive measures. To those ends, these professionals must rely on data generated during every patient\u2019s healthcare encounter, properly aggregated, as well as other data sources.\nManagers are not able to plan a strategy, oversee its performance, and assess the achieved outcomes without a tool that allows them to process all the necessary information and provides them with accurate data, timely and in due form. These data are required from the very beginning, since the definition of an appropriate strategy must be based on the knowledge of the population\u2019s health status, complemented with projections of its potential progress.\nThis complexity has been increased in recent years by a major change in healthcare organizations, which have evolved from a clearly paternalist way of interaction with their patients to a completely different one, focused on seeking their empowerment. In addition, patients are not content anymore with the information provided to them by their doctors, but search the internet for additional data about their diseases, engage on social networks, make their own decisions, and register information on their health records. 
This new role is indeed required if, for instance, health authorities seek to promote one of the most important lines of action in the field of chronic patient management, self-care encouragement, which has a beneficial impact on both the patient and the health system. However, this also requires a more varied interaction between them, combining traditional simple events, like setting up appointments with a general practitioner, with more complex actions, like monitoring health data analyzed and stored by wearable devices.\nDespite this, it is clearly positive that both society and the medical community have evolved from a discussion about giving patients clearance to access their own healthcare information to a totally different one about seeking the best way for patients to register data in their health records, either in an active and conscious way or in a passive and automated one via specific devices. In any case, it must always be taken into account that the management of healthcare information is not a process unrelated to healthcare but rather an inseparable component of healthcare itself; hence its management and supervision are the healthcare professionals\u2019 responsibility, even though patients take a more active role. Furthermore, every professional must accept this new reality and provide the patient with the necessary training so that this initiative ends up being successful.[7]\nApart from this, the temptation to exploit the information stored on different social networks turns out to be very powerful. It is true that, for the health system, it is a possibility worth exploring, but several conditioning factors must also be considered. 
The first one is data protection as a consequence of people\u2019s right to privacy, something that, from the beginning, seems to collide clearly with the business model of social networks themselves, designed to share large amounts of information in a quick, heterogeneous, and, up to some point, uncontrolled way. \nPrecisely these features represent another important conditioning factor, since social networks are nothing but huge repositories that store unstructured, poorly classified, or simply uncategorized data, not to mention the more than likely irrelevance of most of them regarding healthcare and, moreover, their doubtful veracity, a feature essential to this field. Given their market penetration, with millions of users around the world, it seems advisable to assess the possibility of using social networks as an information source for health systems, as long as a model can be defined that solves or at least mitigates all the inconveniences mentioned above.\n\nPotential contributions of big data to health systems \nThe field of big data analytics is rapidly expanding, up to the point that it has begun to play a main role in the evolution of healthcare practices and research, by providing tools to register, manage, and analyze huge amounts of both structured and unstructured data produced by current healthcare information systems.[8]\nHealth-related big data streams can be classified into three categories[5]:\n\n Traditional healthcare data are generated within the health system and stored in datasets such as health records, medical imaging tests, lab reports, or pathology results, among others. Analyzing this information makes it possible to achieve a better understanding of disease outcomes and their risk factors, and also to reduce health system costs, thus making the system more efficient.\n \u201cOmics\u201d data deal with large-scale datasets in the biological and molecular fields, such as genomics, microbiomics or proteomics, for instance. 
The study of this information leads to deeper knowledge about how diseases behave, accelerating the individualization of medical treatments.\n Data from social media make it possible to understand how individuals or groups use the internet, social media, apps, sensor devices, wearable devices or any other tools to better inform and enhance their health.\nAdditionally, the inclusion of geographical and environmental information may further increase the ability to interpret gathered data and extract new knowledge.[9][10]\nCombinations of several types of data must also be taken into account. The concept of personalized medicine, partially introduced above, seeks to combine the patient\u2019s health record and genomic data in order to support the clinical decision-making process, making it predictive, personalized, preventive and participatory, an idea known as \u201cP4 Medicine.\u201d[11]\nAt the micro level, personalized medicine aims to customize the diagnosis of a disease and the subsequent therapy by taking into account the individual patient\u2019s characteristics, instead of relying on decisions taken according to general guidelines, defined as a result of population-based studies and clinical trials. This will require the integration of clinical information, mainly patient records, and biological data such as genome or protein sequences. These data are generated from different and heterogeneous sources, and they have very diverse formats.[11]\nIn fact, healthcare data no longer need to be restricted to traditional datasets such as electronic health records. For instance, mobile or wearable devices monitoring physiological signals can provide timely access to multiple data points that are increasingly interconnected. Traditionally, the data generated by these sorts of devices have not been stored for more than a brief period of time, being discarded afterwards and therefore preventing any extensive investigation from benefiting from the exploitation of these data. 
However, attempts to use this kind of data have been increasing lately, in order to improve patient care and management.[8][12]\nNevertheless, there is a difference between collecting data, having access to data, and knowing how it should be used to improve healthcare. Now that the technology for handling massive amounts of data is available, the next step is developing tools for information sharing and knowledge management, which are seriously limited by the lack of system interoperability.[11][12]\nFor instance, with full interoperability the ability to collect data in a timely manner from several different sources leads to an increase in registries. Disease registries are still in an early stage, but they might be valuable tools when it comes to supporting patient-centered self-management of chronic illness and defining customized treatment plans. Besides, the integration of computer analysis with appropriate care will help doctors to improve diagnostic accuracy. In a similar way, the integration of medical images with other types of electronic health record data and genomic data can also improve the accuracy of a diagnosis and reduce the time required for it.[8][12]\nA major emphasis of personalized medicine is to match the right drug with the right dosage to the right patient at the right time. Moreover, gene sequencing and the use of the subsequent genetic data in diagnosis and treatment will be essential to the future of personalized medicine, with actions such as the prescription of drugs based on genomic profiles of individual patients, known as pharmacogenomics. However, analytics of high-throughput sequencing techniques in genomics is a problem inherent to big data itself, since the human genome consists of 30,000-35,000 genes. Some ongoing projects aim to integrate clinical data from the genomic level to the physiological level of a human being. 
These initiatives will surely help when it comes to delivering personalized healthcare.[8][11][12]\nAt the macro level, faster access to data allows any hospital to define and apply quality improvement policies based on the constant monitoring of outcomes, ensuring that the strategic goals of the organization are achieved. Hospitals have also used electronic health records, datasets originally intended to document individual healthcare processes, to identify system-related inefficiencies and quality issues. Faster access to data has also been hugely useful for the identification and management of disease outbreaks, allowing public health initiatives to be targeted to specific areas via population analysis.[12]\nMining of electronic health record data has made it possible for researchers to identify possible sources of adverse events. Healthcare professionals have used this information to improve organizational practices and reduce error rates. Moreover, many clinical information systems such as electronic health records and computerized physician order entry systems capture a large amount of metadata about their use, which can be used for auditing purposes, thus allowing the organization to detect user-device interaction problems, shrinking safety margins and minimizing other technology-related safety issues and concerns before any adverse event takes place.[12]\nThe potential impact of big data is not easy to estimate, let alone at such an early stage. A report sponsored by the McKinsey Global Institute states that the proper use of big data within the United States healthcare sector might allow improvements with an estimated value of more than $300 billion every year, two-thirds of which would be achieved by reducing the healthcare expenditure of the whole country.[13]\nHowever, healthcare IT history has made clear that technology-based panaceas do not exist. 
The potential of IT for transforming health systems seems to be widely accepted as a consequence of its contribution to the improvement of healthcare processes, but IT has also caused new issues and risks, such as user-computer interaction problems or technology-induced errors. As a consequence, it seems clear that more IT outcome-based research is still needed in order not only to prove its value but also to quantify it.[12][14]\n\nRequirements for the use of big data within the health information system \nWhile the ability to manage massive amounts of data provides a huge opportunity to develop methods and applications for advanced analysis, the real value of big data will only be achieved if the information extracted from these data is useful to improve clinical decision-making processes and patient outcomes, as well as lower healthcare costs.[11] To that end, several basic requirements must be met, though they are very similar to the requirements of the health information system itself.\nFirst of all, it is essential to ensure the quality of the information. This involves the development of thorough protocols which define the criteria required for data input, validation, harmonization, registry, processing, and transmission to other components of the information system. In fact, several of the main requirements of data mining are the technical correctness of data, the accuracy and statistical performance, and the update or reassessment of the analysis.[15]\nIn the health field, the information managed is so complex and heterogeneous that it is necessary to employ data carefully structured and, as long as it is possible, categorized. This is useful for data identification and error control purposes. Furthermore, healthcare information is a perfect example of three major features, commonly known as the three Vs, widely accepted as defining characteristics of big data: volume, variety, and velocity. 
In addition to these, a fourth V, the veracity of healthcare data, is obviously critical for its meaningful use.[8]\nAll possible information sources and data flows within the health information system must be perfectly identified as well. Since the information system must store all data required for the performance of the different corporate functions of the health system, it is clear that all of its components must be interoperable, as stated above, so that any data can be accessed from any point of the health system that needs them. Hence another cardinal requirement is the interoperability of systems, subsystems, and components, defined as their capability of exchanging information without altering the meaning of the exchanged data, regardless of their source and their use within each system. \nFor instance, a medical consultation generates information used for the patient\u2019s healthcare, the management of the employed resources, and the billing of the service, but it can also be used in the medium and long term for outcome assessment, strategic planning, research, education, epidemiological surveillance, or even as evidence in legal proceedings. Moreover, the aggregate of all the data generated during that consultation and during millions of similar healthcare events will be useful to create knowledge, on which clinical decision support systems will be based.\nTherefore the cycle comprises the transition from data to information, from information to knowledge, and from knowledge to practice. All of this needs the interoperability of clinical information systems, logistic and economic-financial systems, business intelligence systems, and university\/R&D center systems, among others. As a consequence, every system must be capable of filtering the information received in order to extract the data it needs, so as to not compromise their processing, thus avoiding the risk of producing adulterated results. 
\nFinally, from a technological point of view, it is mandatory to have a high-performance IT infrastructure on which to rely for the generation, storing, processing, and exchange of large data volumes in a quick and efficient way. Luckily, hardware, software, and communications solutions have made huge progress in recent years, so technological viability is hardly an obstacle nowadays.\n\nSome additional issues and barriers \nThe implementation of big data solutions and tools in the healthcare field requires addressing not only the organizational and technological issues detailed above, but also several legal and ethical questions. \nFrom a legal point of view, the first cause of conflict may be data ownership. As explained in previous sections, any data, properly processed and analyzed, can be turned into knowledge, and the latter can be easily made profitable. The first companies working this angle are tech giants such as Google, which provides personalized advertisements based on navigation and search history, and Facebook, which admitted to focusing part of its efforts on sociological research based on its users\u2019 data, and has even tried to take possession of this information in a completely unilateral way.[7]\nGiven that the generation, registry, and processing of all this information requires a powerful hardware and software infrastructure, and therefore a large investment by these companies, their intention to make it profitable may be considered legitimate to a certain point, especially if they are not charging users for the service provided. However, limitations regarding the use of the stored data must be clearly established, something that seems to be far from being solved with the current legal framework, which is quite confusing. 
For instance, in the case of Spain, this framework combines European Union, national, regional, and sectoral (both health and e-government) regulations.[7] Moreover, much of this legislation is largely outdated, since it was passed at a time when IT progress was nowhere near what we have today.[16]\nGiven this situation, a revision of these regulations that takes into account the current potential of big data solutions, as well as their foreseeable potential in the short and medium term, seems more than appropriate. Of course, this revision must be addressed with the goal of balancing the individual interests of patients (right to privacy) and professionals (legal certainty in the performance of their healthcare and management duties), as well as the general interests of society (research, education, or improvement of healthcare services, among others). To that end, protocols must be defined that combine both a priori measures, such as data anonymization, and a posteriori measures, such as thorough audits regarding the access and use of data. Keeping the human factor in mind, one of the most crucial a priori measures will always be raising the awareness of patients, professionals, and organizations.\nFrom an ethical point of view, quite a few similarities to the legal field can be observed. The fact that IT is going to play an increasingly important role in health systems seems to be widely accepted, since its potential as a key instrument for the transformation of the current model is appreciated. Nevertheless, there is also great concern about the lack of transparency in the management of the large amounts of data held by healthcare organizations. 
For this reason, the promotion of more and better control measures is backed by bioethics experts, starting with the development of a specific legal framework that can be turned into clear and visible actions, thus transmitting a sense of security and helping to promote trust in healthcare data mining.[15]\n\nBig data in SERGAS: A case study \nWithin the Spanish National Health System, healthcare is the responsibility of the Autonomous Communities, which represent the regional level of government, and each has its own health service. In the case of the community of Galicia, this is the Galician Health Service (Servizo Galego de Sa\u00fade, SERGAS).\nSERGAS relies on a business intelligence (BI) solution for the exploitation of structured datasets, which are provided by a regional database that aggregates the information supplied by the different hospitals and primary care centers of this health service. In addition, a management system for information related to human resources and pharmaceutical expenditure provides structured data as well.\nIn order to complement this BI system, SERGAS has implemented big data technologies so as to exploit unstructured data stored in the patients\u2019 electronic health records. This innovation makes SERGAS the first Spanish health service to use big data in a systematic way. 
With a total budget of 982,278 euros, several projects have been developed regarding the following lines of action:\nRare disease management\n\n detection of suspected cases\n creation of a rare diseases registry\nChronic disease management\n\n detection of diabetes mellitus type 2 patients, chronic obstructive pulmonary disease patients, and patients with multiple pathologies who are not yet categorized as such in their health records\n calculation of prevalence and incidence indicators, as well as risk factors\nClinical research\n\n decision-making support regarding the selection of the most appropriate kind of vascular endoprostheses (stents)\nNosocomial infection management\n\n research and categorization of detected cases\n automated alerts\nSurveillance of several syndromes\n\n case identification\n detection of foodborne illness and acute respiratory symptom outbreaks\nExploitation of lab test results \n\n currently in progress\nAs a whole, these systems are handling information belonging to 2.9 million patients, provided by 63 different data sources. As of 2016, 59 million normalized events had been compiled, 12 million documents (of 50 different kinds) had been semantically processed, and 500,000 cases had been detected. \nRegarding information security, SERGAS applies a set of corporate criteria, with standard measures such as the definition of user profiles and access authorization levels, the anonymization of aggregate data, and the performance of audits to verify regulatory compliance. Additionally, there are several committees that define the guidelines for the management of ethics and governance, always within the current legal framework.\n\nConclusions \nAs with IT in general, the successful implementation of big data solutions in a healthcare environment will depend on their capability to generate added value that benefits patients, professionals, and organizations. 
No one seems to doubt the need to improve public health systems by evolving their current model, or the potentially valuable contributions of big data in this respect, but the great complexity of implementing these kinds of tools also seems evident, given the requirements and, in some cases, the obstacles of various kinds that must be dealt with.\nNow that technological viability has apparently been achieved, it is time for healthcare organizations and authorities to face the challenge of studying the possibilities of big data and seeking the best way of applying it to the solution of their issues, problems, and needs. In order to achieve this, they should not start by asking what information they have now and what they can achieve with it, but should instead focus on what information they need and how they can get it. The most frequent problem will not be the availability of the necessary data, but the screening of the relevant information and how to assess it. In summary, the most important thing is not having the data, since this is already happening, but being able to ask the right questions at the right moment, process the data to provide only the necessary and relevant information, and show the latter to healthcare professionals in such a way that they can assimilate it in a quick, correct, and easy manner in order to make the right decisions at the right time.\nOn the healthcare side, big data must become the foundation of clinical decision-making support systems, and also an instrument for data aggregation concerning public health departments, as well as research and education. On the management side, managers will have access to more accurate and timely knowledge concerning the real status of their organizations and will be able to adopt a proactive plan instead of a retrospective one. In addition, they will be capable of detecting deviations from objectives earlier and applying the appropriate corrective and, preferably, preventive measures. 
\nIn conclusion, the implementation of big data must be one of the main instruments for change in the current health system model, changing it into one with improved effectiveness and efficiency, taking into account both healthcare and economic outcomes of health services, thus being meaningful to patients and also to society, all while taking advantage of patients\u2019 potential as active participants in their own care.\n\nAcknowledgements \nThe authors wish to thank Ms. Pilar Carnicero and Mr. Guillermo V\u00e1zquez for their contributions in elaborating on and improving this article.\n\nReferences \n\n\n\u2191 World Health Organization (2007). Everybody's business -- Strengthening health systems to improve health outcomes: WHO's framework for action. World Health Organization. pp. 44. ISBN 9789241596077. http:\/\/www.who.int\/iris\/handle\/10665\/43918 .   \n\n\u2191 \"The Tallinn Charter: Health Systems for Health and Wealth\". World Health Organization. 27 June 2008. http:\/\/www.euro.who.int\/en\/publications\/policy-documents\/tallinn-charter-health-systems-for-health-and-wealth .   \n\n\u2191 Rojas, D.; Carnicero, J. (2015). \"A Model Of Information System For Healthcare: Global Vision and Integrated Data Flows\". In Berhardt, L.V.. Advances in Medicine and Biology. 82. Nova Science Publishers. ISBN 9781634636339. https:\/\/www.novapublishers.com\/catalog\/product_info.php?products_id=52835 .   \n\n\u2191 Carnicero, J.; Rojas, D.; Gonz\u00e1lez, A. et al. (2016) (PDF). La explotaci\u00f3n de datos de salud: Retos, oportunidades y l\u00edmites. Sociedad Espa\u00f1ola de Inform\u00e1tica de la Salud. ISBN 9788460889472. http:\/\/www.seis.es\/documentos\/Informe%20La%20explotacion%20de%20datos%20de%20Salud\/LA%20EXPLOTACI%C3%93N%20DE%20DATOS%20DE%20SALUD.pdf .   \n\n\u2191 5.0 5.1 Hansen, M.M.; Miron-Shatz, T.; Lau, A.Y.; Paton, C. (2014). 
\"Big Data in Science and Healthcare: A Review of Recent Literature and Perspectives - Contribution of the IMIA Social Media Working Group\". Yearbook of Medical Informatics 9: 21\u20136. doi:10.15265\/IY-2014-0004. PMC PMC4287084. PMID 25123717. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4287084 .   \n\n\u2191 Porter, M.E.; Lee, T.H. (2013). \"The Strategy That Will Fix Health Care\". Harvard Business Review 10: 50\u201370. https:\/\/hbr.org\/2013\/10\/the-strategy-that-will-fix-health-care .   \n\n\u2191 7.0 7.1 7.2 Mart\u00ednez, R., Rojas, D. (2014). \"Gesti\u00f3n de la seguridad de la informaci\u00f3n en atenci\u00f3n primaria y uso responsable de Internet y de las redes sociales\". In Carnicero, J.; Fern\u00e1ndez, A.; Rojas de la Escalera, D.. Manual de salud electr\u00f3nica para directivos de servicios y sistemas de salud. 2. United Nations. https:\/\/repositorio.cepal.org\/handle\/11362\/37058 .   \n\n\u2191 8.0 8.1 8.2 8.3 8.4 Belle, A.; Thiagarajan, R.; Soroushmehr, S.M.R. et al. (2015). \"Big data analytics in healthcare\". BioMed Research International 2015 (2015): 370194. doi:10.1155\/2015\/370194.   \n\n\u2191 Luo, J.; Wu, M.; Gopukumar, D.; Zhao, Y. (2016). \"Big Data Application in Biomedical Research and Health Care: A Literature Review\". Biomedical Informatics Insights 8: 1\u201310. doi:10.4137\/BII.S31559. PMC PMC4720168. PMID 26843812. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4720168 .   \n\n\u2191 National Academies of Sciences, Engineering, and Medicine (2016). Big Data and Analytics for Infectious Disease Research, Operations, and Policy: Proceedings of a Workshop. The National Academies Press. doi:10.17226\/23654. ISBN 9780309450140.   \n\n\u2191 11.0 11.1 11.2 11.3 11.4 Panahiazar, M.; Taslimitehrani, V.; Jadhav, A.; Pathak, J. (2014). \"Empowering Personalized Medicine with Big Data and Semantic Web Technology: Promises, Challenges, and Use Cases\". 
Proceedings of the IEEE International Conference on Big Data 2014: 790\u20135. doi:10.1109\/BigData.2014.7004307. PMC PMC4333680. PMID 25705726. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4333680 .   \n\n\u2191 12.0 12.1 12.2 12.3 12.4 12.5 12.6 Kuziemsky, C.E.; Monkman, H.; Petersen, C. et al. (2014). \"Big Data in Healthcare - Defining the Digital Persona through User Contexts from the Micro to the Macro. Contribution of the IMIA Organizational and Social Issues WG\". Yearbook of Medical Informatics 2014: 82\u20139. doi:10.15265\/IY-2014-0014. PMC PMC4287094. PMID 25123726. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4287094 .   \n\n\u2191 Manyika, J.; Chui, M.; Brown, B. et al. (May 2011). \"Big data: The next frontier for innovation, competition, and productivity\". McKinsey Global Institute. https:\/\/www.mckinsey.com\/business-functions\/digital-mckinsey\/our-insights\/big-data-the-next-frontier-for-innovation .   \n\n\u2191 Carnicero, J.; Rojas, D.; Mart\u00ednez, R. et al. (2016) (PDF). XI Informe SEIS - Las TIC y la seguridad de los pacientes: Primum non nocere. Sociedad Espa\u00f1ola de Inform\u00e1tica de la Salud. ISBN 9788461740246. http:\/\/www.seis.es\/documentos\/XI%20InformeSeis.pdf .   \n\n\u2191 15.0 15.1 Le\u00f3n, P. (2016). \"Cap\u00edtulo III: Bio\u00e9tica y explotaci\u00f3n de grandes conjuntos de datos\". In Carnicero, J.; Rojas, D.; Gonz\u00e1lez, A. et al. (PDF). La explotaci\u00f3n de datos de salud: Retos, oportunidades y l\u00edmites. Sociedad Espa\u00f1ola de Inform\u00e1tica de la Salud. ISBN 9788460889472. http:\/\/www.seis.es\/documentos\/Informe%20La%20explotacion%20de%20datos%20de%20Salud\/LA%20EXPLOTACI%C3%93N%20DE%20DATOS%20DE%20SALUD.pdf .   \n\n\u2191 And\u00e9rez, A. (2016). \"Cap\u00edtulo IV: Disposiciones legales aplicables\". In Carnicero, J.; Rojas, D.; Gonz\u00e1lez, A. et al. (PDF). 
La explotaci\u00f3n de datos de salud: Retos, oportunidades y l\u00edmites. Sociedad Espa\u00f1ola de Inform\u00e1tica de la Salud. ISBN 9788460889472. http:\/\/www.seis.es\/documentos\/Informe%20La%20explotacion%20de%20datos%20de%20Salud\/LA%20EXPLOTACI%C3%93N%20DE%20DATOS%20DE%20SALUD.pdf .   \n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to grammar, spelling, and presentation, including the addition of PMCID and DOI when they were missing from the original reference. Citation three is listed in the references of the original but inadvertently omitted from the inline citations; it has been placed in the text at what is believed to be the appropriate citation point. Additionally, the original document placed citations 11 and 12 before nine and 10; this version shows citations in order of appearance, by design.\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Big_data_and_public_health_systems:_Issues_and_opportunities\">https:\/\/www.limswiki.org\/index.php\/Journal:Big_data_and_public_health_systems:_Issues_and_opportunities<\/a>\nCategories: LIMSwiki journal articles (added in 2018); LIMSwiki journal articles (all); LIMSwiki journal articles on health informatics
\nThis page was last modified on 20 March 2018, at 23:11.\nContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n\n","b94dc07071fd3149fbecd75f93d73558_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_Big_data_and_public_health_systems_Issues_and_opportunities skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:Big data and public health systems: Issues and 
opportunities<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p>In recent years, the need for changing the current model of European public health systems has been repeatedly addressed, in order to ensure their sustainability. Following this line, information technology (IT) has always been referred to as one of the key instruments for enhancing the <a href=\"https:\/\/www.limswiki.org\/index.php\/Information_management\" title=\"Information management\" target=\"_blank\" class=\"wiki-link\" data-key=\"f8672d270c0750a858ed940158ca0a73\">information management<\/a> processes of healthcare organizations, thus contributing to the improvement and evolution of health systems. In the IT field, big data solutions are expected to play a major role, since they are designed for handling huge amounts of <a href=\"https:\/\/www.limswiki.org\/index.php\/Information\" title=\"Information\" target=\"_blank\" class=\"wiki-link\" data-key=\"6300a14d9c2776dcca0999b5ed940e7d\">information<\/a> in a fast and efficient way, allowing users to make important decisions quickly. This article reviews the main features of the European public health system model and the corresponding healthcare and management-related information systems, the challenges that these health systems are currently facing, and the possible contributions of big data solutions to this field. 
To that end, the authors share their professional experience with the Spanish public health system and review the existing literature related to this topic.\n<\/p><p><b>Keywords<\/b>: big data, health system, healthcare organizations, health information systems, epidemiological surveillance, strategic planning\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"The_health_system\">The health system<\/span><\/h3>\n<p>According to the World Health Organization (WHO), \u201ca health system consists of all organizations, people and actions whose primary intent is to promote, restore or maintain health. This includes efforts to influence determinants of health as well as more direct health-improving activities. A health system is therefore more than the pyramid of publicly owned facilities that deliver personal health services.\u201d<sup id=\"rdp-ebb-cite_ref-WHOEvery07_1-0\" class=\"reference\"><a href=\"#cite_note-WHOEvery07-1\" rel=\"external_link\">[1]<\/a><\/sup> Furthermore, every health system performs the following set of basic functions<sup id=\"rdp-ebb-cite_ref-WHOTallin08_2-0\" class=\"reference\"><a href=\"#cite_note-WHOTallin08-2\" rel=\"external_link\">[2]<\/a><\/sup>:\n<\/p>\n<ul><li> delivering health services to individuals and to populations<\/li>\n<li> creating resources<\/li>\n<li> providing stewardship<\/li>\n<li> financing the system<\/li><\/ul>\n<p>The center of any health system must be the first of these functions, since healthcare constitutes the paramount goal and therefore the reason for the existence of the health system itself. Other functions, essential for ensuring healthcare delivery and public health, are organized around it. 
Among these, the following should be highlighted:\n<\/p>\n<ul><li> epidemiological surveillance, which comprises the collection and analysis of large volumes of data directly or indirectly related to people\u2019s health, so as to detect or prevent possible public health problems<\/li>\n<li> planning and overseeing the management of the health system, which allows healthcare organizations to set out their strategic goals, allocate the necessary resources, assess the degree to which these goals are met, and apply corrective measures if required<\/li>\n<li> clinical research, focused on generating knowledge and applying it to the development of new diagnostic and therapeutic techniques<\/li>\n<li> education and teaching, in order to train new professionals and keep the practicing ones appropriately updated and competent<\/li><\/ul>\n<h3><span class=\"mw-headline\" id=\"The_health_cluster_or_ecosystem\">The health cluster or ecosystem<\/span><\/h3>\n<p>From a structural point of view, a health system is neither an isolated nor a homogeneous entity; rather, it comprises or relates to entities of a diverse nature, both public and private, with interests of their own, as well as shared interests. 
This ensemble is known as the health cluster or ecosystem, and its components include the following<sup id=\"rdp-ebb-cite_ref-RojasAModel15_3-0\" class=\"reference\"><a href=\"#cite_note-RojasAModel15-3\" rel=\"external_link\">[3]<\/a><\/sup>:\n<\/p>\n<ul><li> central or federal government and regional or local authorities <\/li>\n<li> healthcare services, conceived as organizations responsible for the management of a given healthcare network <\/li>\n<li> <a href=\"https:\/\/www.limswiki.org\/index.php\/Hospital\" title=\"Hospital\" target=\"_blank\" class=\"wiki-link\" data-key=\"b8f070c66d8123fe91063594befebdff\">hospitals<\/a><\/li>\n<li> primary care centers<\/li>\n<li> emergency services<\/li>\n<li> pharmacies<\/li>\n<li> convalescent centers<\/li>\n<li> health professionals acting as external providers to the health system<\/li>\n<li> public health services<\/li>\n<li> insurance companies, mutual societies, and other entities which finance healthcare<\/li>\n<li> schools for the education and training of doctors, nurses, and other health professionals<\/li>\n<li> research centers<\/li>\n<li> professional associations and colleges<\/li>\n<li> foundations and learned societies<\/li>\n<li> stakeholders, such as patient associations <\/li>\n<li> pharmaceutical and other health technology industries<\/li><\/ul>\n<h3><span class=\"mw-headline\" id=\"Challenges_faced_by_the_health_system\">Challenges faced by the health system<\/span><\/h3>\n<p>For decades, the public health systems of European countries, created following the end of World War II, have been frequently mentioned as a reference model to be followed, especially in those aspects regarding coverage, quality of service, and contribution to the welfare of society. 
However, the context in which these systems arose has undergone a series of major changes, the most important being the following<sup id=\"rdp-ebb-cite_ref-CarniceroLaExplot16_4-0\" class=\"reference\"><a href=\"#cite_note-CarniceroLaExplot16-4\" rel=\"external_link\">[4]<\/a><\/sup>:\n<\/p>\n<ul><li> the aging of the population, with a continuous increase in chronic and degenerative diseases<\/li>\n<li> the financial crisis, which causes significant cuts in the public funds meant to finance health systems\u2019 activities, and makes it more difficult, or even impossible, for citizens to compensate for these cuts with out-of-pocket expenses<\/li>\n<li> the creation of new techniques and drugs, more effective but also more expensive, mainly due to the need to recoup the research costs of their development<\/li>\n<li> the increasing demands of citizens, who require more and better healthcare services in a setting that seeks patient empowerment and promotion of personalized medicine<\/li><\/ul>\n<p>As an illustration of the first two factors, aging of the population and public budget cuts, the Spanish case is addressed below. 
Table 1 shows the evolution of these two indicators between 2003 and 2014.\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"10\"><b>Table 1.<\/b> Demographics and health expenditure in Spain (2003\u20132014)<br \/>Sources - demographics: Demographic Information System, Spanish National Statistics Institute (INE); health expenditure data: OECD Health Statistics\n<\/td><\/tr>\n<tr>\n<th style=\"padding-left:10px; padding-right:10px;\" rowspan=\"2\">Year\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\" rowspan=\"2\">Total population\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\" rowspan=\"2\">Population 15\u201364 years\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\" colspan=\"2\">Population older than 64 years\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\" rowspan=\"2\">Dependency ratio\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\" colspan=\"2\">Public health expenditure\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\" colspan=\"2\">Private health expenditure\n<\/th><\/tr>\n<tr>\n<th style=\"padding-left:10px; padding-right:10px;\">People\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">% of total\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">M\u20ac\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">% GDP\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">M\u20ac\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">% GDP\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2003\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">42,717,064\n<\/td>\n<td style=\"background-color:white; padding-left:10px; 
padding-right:10px;\">29,396,965\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">7,276,620\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">17.03%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">24.75%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">43,158.4\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">5.37%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">17,354.5\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2.16%\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2004\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">43,197,684\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">29,777,965\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">7,301,009\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">16.90%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">24.52%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">46,992.4\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">5.46%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">18,651.1\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2.17%\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2005\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">44,108,530\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">30,511,110\n<\/td>\n<td style=\"background-color:white; 
padding-left:10px; padding-right:10px;\">7,332,267\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">16.62%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">24.03%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">51,351.5\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">5.52%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">20,094.2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2.16%\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2006\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">44,708,964\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">30,849,177\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">7,484,392\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">16.74%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">24.26%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">56,662.2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">5.62%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">21,520.7\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2.14%\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2007\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">45,200,737\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">31,188,079\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">7,531,826\n<\/td>\n<td 
style=\"background-color:white; padding-left:10px; padding-right:10px;\">16.66%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">24.15%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">61,612.0\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">5.70%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">23,101.9\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2.14%\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2008\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">46,157,822\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">31,869,008\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">7,632,925\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">16.54%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">23.95%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">68,147.1\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">6.11%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">24,392.9\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2.19%\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2009\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">46,745,807\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">32,145,023\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">7,782,904\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">16.65%\n<\/td>\n<td 
style=\"background-color:white; padding-left:10px; padding-right:10px;\">24.21%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">73,035.6\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">6.77%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">23,863.0\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2.21%\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2010\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">47,021,031\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">32,153,527\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">7,931,164\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">16.87%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">24.67%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">72,852.6\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">6.74%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">24,593.7\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2.28%\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2011\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">47,190,493\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">32,082,758\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">8,093,557\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">17.15%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">25.23%\n<\/td>\n<td 
style=\"background-color:white; padding-left:10px; padding-right:10px;\">71,800.0\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">6.68%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">25,510.2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2.38%\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2012\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">47,265,321\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">31,980,402\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">8,222,196\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">17.40%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">25.71%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">68,262.9\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">6.47%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">26,594.3\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2.55%\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2013\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">47,129,783\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">31,718,285\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">8,335,861\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">17.69%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">26.28%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; 
padding-right:10px;\">65,718.5\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">6.26%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">26,981.3\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2.62%\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2014\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">46,771,341\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">31,281,943\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">8,442,427\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">18.05%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">26.99%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">65,975.7\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">6.34%\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">28,558.1\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2.74%\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>These data reveal that the Spanish population has increased from 42.72 to 46.77 million people during the 2003-2014 period, while the percentage of people older than 64 years has risen from 17.03% to 18.05% over total population, and the dependency ratio, which indicates the ratio between population older than 64 years and population between 15 and 64 years old, has risen from 24.75% to 26.99%. On the other hand, public health expenditure in 2003 meant 5.37% GDP, reaching a peak of 6.77% in 2009 and falling to 6.26% in 2013, experiencing a small recovery in 2014, with 6.34% GDP. 
Private health expenditure bottomed out at around 2.14%-2.17% of GDP, but it has risen year after year since the beginning of the financial crisis in 2008, reaching 2.74% of GDP in 2014. \n<\/p><p>All things considered, the impact of all these determinants is so important that the sustainability of this model of public health system has been questioned in recent years.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"The_transformation_of_public_health_systems\">The transformation of public health systems<\/span><\/h3>\n<p>Although the challenges explained above make clear that a deep transformation of this health system model is needed, and IT is often considered one of the main facilitators of this change, it would be wrong to assume that health systems are going to lose their essential features. Health systems must improve people\u2019s health, from both an individual and a collective point of view, and this final goal will not change in spite of the introduction of new technologies such as big data. \n<\/p><p>The patient must always be the center of any health system and, in the same way, health information must always be the center of a <a href=\"https:\/\/www.limswiki.org\/index.php\/Clinical_informatics\" title=\"Clinical informatics\" class=\"mw-redirect wiki-link\" target=\"_blank\" data-key=\"bda8123083aecb94afe79afec9ae4686\">health information system<\/a>, which will be introduced below. The actions of a clinical professional focus on the achievement of specific healthcare goals customized for each one of their patients, improving or maintaining their health status. Besides knowledge, healthcare requires a connected and personalized relationship between the provider and the patient, so that interventions are tailored to the patient\u2019s unique preferences and behavior, as with, for instance, drug adherence. 
Different people will have different reasons for non-adherence.<sup id=\"rdp-ebb-cite_ref-HansenBigData14_5-0\" class=\"reference\"><a href=\"#cite_note-HansenBigData14-5\" rel=\"external_link\">[5]<\/a><\/sup>\n<\/p><p>For their part, health system managers must seek to fulfill the general goals defined by their organizations. These goals will be the aggregate of the individual goals related to each one of the professionals in their clinical staff. These managers will also be responsible for the allocation of the necessary resources and the financing of the whole activity in their organizations.\n<\/p><p>On the whole, health systems must focus their efforts on the creation of value for both the patient and society. To that end, clear goals must be defined that find an appropriate balance between the patient\u2019s personal interests and the collective interest of society. For instance, in the event of a surgical intervention it is mandatory to measure some indicators such as mortality rate, adverse events, time of recovery, care costs, or time for the patients to return to their jobs at full capacity. Nevertheless, it is also necessary to take into account other indicators, perhaps more subjective and thus harder to measure, but equally important because of their impact on the patient, such as post-surgery functionality, pain suffered, or the cost of all these factors from a quality-of-life point of view. 
If health systems are not focused on their patients\u2019 interests and on achieving the corresponding goals, they will hardly be able to change and ensure their sustainability.<sup id=\"rdp-ebb-cite_ref-PorterTheStrat13_6-0\" class=\"reference\"><a href=\"#cite_note-PorterTheStrat13-6\" rel=\"external_link\">[6]<\/a><\/sup>\n<\/p><p>This article reviews the main features of the European public health system model and the corresponding healthcare and management-related information systems, the problems and challenges that these health systems are currently facing, and the solutions that big data tools may potentially offer in that respect. To that end, the authors have based this work on their professional experience with the Spanish public health system, an analysis of the scene that the latter is facing in the upcoming years and decades, and a review of the existing literature on big data applied to health.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Big_data_solutions_in_a_healthcare_environment\">Big data solutions in a healthcare environment<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"The_health_information_system\">The health information system<\/span><\/h3>\n<p>The individual performance of the different components of the health cluster, as well as their interactions, causes the creation of multiple data flows, which are also greatly varied, since they involve several business processes. This set of data flows creates the health information system as a result. \n<\/p><p>As mentioned above, just as healthcare is the center of any health system, patient-related healthcare information must be the center of the health information system as well, since it may and will also be used for activities other than healthcare, such as epidemiological surveillance, planning, overseeing of management, clinical research, and education and training, as stated in the introduction. 
The fact that these data are stored in healthcare information systems is a consequence of them being generated during the patient\u2019s care, but their usefulness goes clearly beyond this limit. \n<\/p><p>Therefore, the health information system must allow users to register, process, consult and share large amounts of data, ensuring their availability at the appropriate moment and point of the health cluster. On the healthcare side, this cluster reveals itself as a huge generator and at the same time consumer of enormous sets of information, related to personalized healthcare processes that take place on a daily basis and in a massive way. Healthcare is considerably intense regarding data treatment, by constantly creating immense datasets and frequently requiring access to knowledge sources.\n<\/p><p>For IT to be fully integrated in the health system value chain, it is mandatory to have a health information system which serves as an instrument for knowledge management being useful to all its users. Healthcare professionals cannot perform their duties properly without registering and using patients\u2019 information, or without accessing the knowledge sources that allow them to make decisions on a solid basis. Public health departments need to know the population health status in order to detect or prevent potential collective health issues, as well as defining the necessary corrective and preventive measures. To those ends, these professionals must rely on data generated during every patient\u2019s healthcare encounter, properly aggregated, as well as other data sources.\n<\/p><p>Managers are not able to plan a strategy, oversee its performance, and assess the achieved outcomes without a tool that allows them to process all the necessary information and provides them with accurate data, timely and in due form. 
These data are required from the very beginning, since the definition of an appropriate strategy must be based on the knowledge of the population\u2019s health status, complemented with projections of its potential progress.\n<\/p><p>This complexity has been increased in recent years by a major change in healthcare organizations, which have evolved from a clearly paternalistic mode of interaction with their patients to a completely different one, focused on seeking their empowerment. In addition, patients are no longer content with the information provided to them by their doctors, but search the internet for additional data about their diseases, engage on social networks, make their own decisions, and register information on their health records. This new role is indeed required if, for instance, health authorities seek to promote one of the most important lines of action in the field of chronic patient management, self-care encouragement, which has a beneficial impact on both the patient and the health system. However, this also requires a more varied interaction between them, combining traditional simple events, like setting up appointments with a general practitioner, with more complex actions, like monitoring health data analyzed and stored by wearable devices.\n<\/p><p>Despite this, it is clearly positive that both society and the medical community have evolved from a discussion about giving patients clearance to access their own healthcare information, to a totally different one about seeking the best way for the patients to register data in their health records, either in an active and conscious way or in a passive and automated one via specific devices. 
In any case, it must always be taken into account that the management of healthcare information is not a process unrelated to healthcare but rather an inseparable component of healthcare itself; hence, its management and supervision are the healthcare professionals\u2019 responsibility, even though the patients take a more active role. Furthermore, every professional must accept this new reality and provide the patient with the necessary training so that this initiative ends up being successful.<sup id=\"rdp-ebb-cite_ref-Mart.C3.ADnezGesti.C3.B3n14_7-0\" class=\"reference\"><a href=\"#cite_note-Mart.C3.ADnezGesti.C3.B3n14-7\" rel=\"external_link\">[7]<\/a><\/sup>\n<\/p><p>Apart from this, the temptation to exploit the information stored on different social networks is very powerful. It is true that, for the health system, it is a possibility worth exploring, but several conditioning factors must also be considered. The first one is data protection as a consequence of people\u2019s right to privacy, something that, from the beginning, seems to collide clearly with the business model of social networks themselves, designed to share large amounts of information in a quick, heterogeneous, and, to some extent, uncontrolled way. \n<\/p><p>Precisely these features represent another important conditioning factor, since social networks are nothing but huge repositories that store unstructured, poorly classified, or simply uncategorized data, not to mention the more than likely irrelevance of most of these data to healthcare and, moreover, their doubtful veracity, a feature essential to this field. 
Given their market penetration, with millions of users around the world, it seems advisable to assess the possibility of using social networks as an information source for health systems, as long as a model can be defined that solves or at least mitigates all the inconveniences mentioned above.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Potential_contributions_of_big_data_to_health_systems\">Potential contributions of big data to health systems<\/span><\/h3>\n<p>The field of big data analytics is rapidly expanding, up to the point that it has begun to play a main role in the evolution of healthcare practices and research, by providing tools to register, manage, and analyze huge amounts of both structured and unstructured data produced by current healthcare information systems.<sup id=\"rdp-ebb-cite_ref-BelleBigData15_8-0\" class=\"reference\"><a href=\"#cite_note-BelleBigData15-8\" rel=\"external_link\">[8]<\/a><\/sup>\n<\/p><p>Health-related big data streams can be classified into three categories<sup id=\"rdp-ebb-cite_ref-HansenBigData14_5-1\" class=\"reference\"><a href=\"#cite_note-HansenBigData14-5\" rel=\"external_link\">[5]<\/a><\/sup>:\n<\/p>\n<ul><li> Traditional healthcare data are generated within the health system and stored in datasets such as health records, medical imaging tests, lab reports, or pathology results, among others. Analyzing this information makes it possible to better understand disease outcomes and their risk factors, and also to reduce health system costs, thus making health systems more efficient.<\/li><\/ul>\n<ul><li> \u201cOmics\u201d data deal with large-scale datasets in the biological and molecular fields, such as genomics, microbiomics, or proteomics. 
The study of this information leads to deeper knowledge about how diseases behave, accelerating the individualization of medical treatments.<\/li><\/ul>\n<ul><li> Data from social media reveal how individuals or groups use the internet, social media, apps, sensor devices, wearable devices or any other tools to better inform and enhance their health.<\/li><\/ul>\n<p>Additionally, the inclusion of geographical and environmental information may further increase the ability to interpret gathered data and extract new knowledge.<sup id=\"rdp-ebb-cite_ref-LuoBigData16_9-0\" class=\"reference\"><a href=\"#cite_note-LuoBigData16-9\" rel=\"external_link\">[9]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-NASEMBigData16_10-0\" class=\"reference\"><a href=\"#cite_note-NASEMBigData16-10\" rel=\"external_link\">[10]<\/a><\/sup>\n<\/p><p>Combinations of several types of data must also be taken into account. The concept of personalized medicine, partially introduced above, seeks to combine the patient\u2019s health record and genomic data in order to support the clinical decision-making process, making it predictive, personalized, preventive and participatory, an idea known as \u201cP4 Medicine.\u201d<sup id=\"rdp-ebb-cite_ref-PanahiazarEmpower14_11-0\" class=\"reference\"><a href=\"#cite_note-PanahiazarEmpower14-11\" rel=\"external_link\">[11]<\/a><\/sup>\n<\/p><p>At the micro level, personalized medicine aims to customize the diagnosis of a disease and the subsequent therapy by taking into account the individual patient\u2019s characteristics, instead of relying on decisions taken according to general guidelines, defined as a result of population-based studies and clinical trials. This will require the integration of clinical information, mainly patient records, and biological data such as genome or protein sequences. 
These data are generated from different and heterogeneous sources, and they have very diverse formats.<sup id=\"rdp-ebb-cite_ref-PanahiazarEmpower14_11-1\" class=\"reference\"><a href=\"#cite_note-PanahiazarEmpower14-11\" rel=\"external_link\">[11]<\/a><\/sup>\n<\/p><p>In fact, healthcare data no longer needs to be restricted to traditional datasets such as <a href=\"https:\/\/www.limswiki.org\/index.php\/Electronic_health_record\" title=\"Electronic health record\" target=\"_blank\" class=\"wiki-link\" data-key=\"f2e31a73217185bb01389404c1fd5255\">electronic health records<\/a>. For instance, mobile or wearable devices monitoring physiological signals can provide timely access to multiple data points that are increasingly interconnected. Traditionally, the data generated by these sorts of devices have not been stored for more than a brief period of time, being discarded afterwards and therefore preventing any extensive investigation to benefit from the exploitation of these data. However, attempts to use this kind of data have been increasing lately, in order to improve patient care and management.<sup id=\"rdp-ebb-cite_ref-BelleBigData15_8-1\" class=\"reference\"><a href=\"#cite_note-BelleBigData15-8\" rel=\"external_link\">[8]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-KuziemskyBigData14_12-0\" class=\"reference\"><a href=\"#cite_note-KuziemskyBigData14-12\" rel=\"external_link\">[12]<\/a><\/sup>\n<\/p><p>Nevertheless, there is a difference between collecting data, having access to data, and knowing how it should be used to improve healthcare. 
Now that the technology for handling massive amounts of data is available, the next step is developing tools for information sharing and knowledge management, which are seriously limited by the lack of system interoperability.<sup id=\"rdp-ebb-cite_ref-PanahiazarEmpower14_11-2\" class=\"reference\"><a href=\"#cite_note-PanahiazarEmpower14-11\" rel=\"external_link\">[11]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-KuziemskyBigData14_12-1\" class=\"reference\"><a href=\"#cite_note-KuziemskyBigData14-12\" rel=\"external_link\">[12]<\/a><\/sup>\n<\/p><p>For instance, with full interoperability the ability to collect data in a timely manner from several different sources leads to an increase in registries. Disease registries are still in an early stage, but they might be valuable tools when it comes to supporting patient-centered self-management of chronic illness and defining customized treatment plans. Besides, the integration of computer analysis with appropriate care will help doctors to improve diagnostic accuracy. In a similar way, the integration of medical images with other types of electronic health record data and genomic data can also improve the accuracy of a diagnosis and reduce the time required for it.<sup id=\"rdp-ebb-cite_ref-BelleBigData15_8-2\" class=\"reference\"><a href=\"#cite_note-BelleBigData15-8\" rel=\"external_link\">[8]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-KuziemskyBigData14_12-2\" class=\"reference\"><a href=\"#cite_note-KuziemskyBigData14-12\" rel=\"external_link\">[12]<\/a><\/sup>\n<\/p><p>A major emphasis of personalized medicine is to match the right drug with the right dosage to the right patient at the right time. Moreover, gene sequencing and the use of the subsequent genetic data in diagnosis and treatment will be essential to the future of personalized medicine, with actions such as the prescription of drugs based on genomic profiles of individual patients, known as pharmacogenomics. 
However, the analysis of high-throughput sequencing data in genomics is inherently a big data problem, since the human genome consists of 30,000-35,000 genes. Some ongoing projects aim to integrate clinical data from the genomic level to the physiological level of a human being. These initiatives will surely help when it comes to delivering personalized healthcare.<sup id=\"rdp-ebb-cite_ref-BelleBigData15_8-3\" class=\"reference\"><a href=\"#cite_note-BelleBigData15-8\" rel=\"external_link\">[8]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-PanahiazarEmpower14_11-3\" class=\"reference\"><a href=\"#cite_note-PanahiazarEmpower14-11\" rel=\"external_link\">[11]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-KuziemskyBigData14_12-3\" class=\"reference\"><a href=\"#cite_note-KuziemskyBigData14-12\" rel=\"external_link\">[12]<\/a><\/sup>\n<\/p><p>At the macro level, faster access to data allows any hospital to define and apply quality improvement policies based on the constant monitoring of outcomes, ensuring that the strategic goals of the organization are achieved. Hospitals have also used electronic health records, datasets originally intended to document individual healthcare processes, to identify system-related inefficiencies and quality issues. Faster access to data has also been hugely useful for the identification and management of disease outbreaks, allowing public health initiatives to be targeted to specific areas via population analysis.<sup id=\"rdp-ebb-cite_ref-KuziemskyBigData14_12-4\" class=\"reference\"><a href=\"#cite_note-KuziemskyBigData14-12\" rel=\"external_link\">[12]<\/a><\/sup>\n<\/p><p>Mining of electronic health record data has made it possible for researchers to identify possible sources of adverse events. Healthcare professionals have used this information to improve organizational practices and reduce error rates. 
Moreover, many clinical information systems such as electronic health records and <a href=\"https:\/\/www.limswiki.org\/index.php\/Computerized_physician_order_entry\" title=\"Computerized physician order entry\" target=\"_blank\" class=\"wiki-link\" data-key=\"f9e67e685f2b29f79e9b0991330f2b10\">computerized physician order entry systems<\/a> capture a large amount of metadata about their use, which can be used for auditing purposes, thus allowing the organization to detect user-device interaction problems, shrinking safety margins and minimizing other technology-related safety issues and concerns before any adverse event takes place.<sup id=\"rdp-ebb-cite_ref-KuziemskyBigData14_12-5\" class=\"reference\"><a href=\"#cite_note-KuziemskyBigData14-12\" rel=\"external_link\">[12]<\/a><\/sup>\n<\/p><p>The potential impact of big data is not easy to estimate, let alone at such an early stage. A report sponsored by the McKinsey Global Institute states that the proper use of big data within the United States healthcare sector might allow improvements with an estimated value of more than $300 billion every year, two-thirds of which would be achieved by reducing the healthcare expenditure of the whole country.<sup id=\"rdp-ebb-cite_ref-ManyikaBigData11_13-0\" class=\"reference\"><a href=\"#cite_note-ManyikaBigData11-13\" rel=\"external_link\">[13]<\/a><\/sup>\n<\/p><p>However, healthcare IT history has made clear that technology-based panaceas do not exist. The potential of IT for transforming health systems seems to be widely accepted as a consequence of its contribution to the improvement of healthcare processes, but IT has also caused new issues and risks, such as user-computer interaction problems or technology-induced errors. 
As a consequence, it seems clear that more IT outcome-based research is still needed in order not only to prove its value but also to quantify it.<sup id=\"rdp-ebb-cite_ref-KuziemskyBigData14_12-6\" class=\"reference\"><a href=\"#cite_note-KuziemskyBigData14-12\" rel=\"external_link\">[12]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-CarniceroLasTIC16_14-0\" class=\"reference\"><a href=\"#cite_note-CarniceroLasTIC16-14\" rel=\"external_link\">[14]<\/a><\/sup>\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Requirements_for_the_use_of_big_data_within_the_health_information_system\">Requirements for the use of big data within the health information system<\/span><\/h3>\n<p>While the ability to manage massive amounts of data provides a huge opportunity to develop methods and applications for advanced analysis, the real value of big data will only be achieved if the information extracted from these data is useful to improve clinical decision-making processes and patient outcomes, as well as lower healthcare costs.<sup id=\"rdp-ebb-cite_ref-PanahiazarEmpower14_11-4\" class=\"reference\"><a href=\"#cite_note-PanahiazarEmpower14-11\" rel=\"external_link\">[11]<\/a><\/sup> To that end, several basic requirements must be met, though they are very similar to the requirements of the health information system itself.\n<\/p><p>First of all, it is essential to ensure the quality of the information. This involves the development of thorough protocols which define the criteria required for data input, validation, harmonization, registry, processing, and transmission to other components of the information system. 
In fact, several of the main requirements of data mining are the technical correctness of data, the accuracy and statistical performance, and the update or reassessment of the analysis.<sup id=\"rdp-ebb-cite_ref-CarniceroLaExplot16-2_15-0\" class=\"reference\"><a href=\"#cite_note-CarniceroLaExplot16-2-15\" rel=\"external_link\">[15]<\/a><\/sup>\n<\/p><p>In the health field, the information managed is so complex and heterogeneous that it is necessary to employ data carefully structured and, as long as it is possible, categorized. This is useful for data identification and error control purposes. Furthermore, healthcare information is a perfect example of three major features, commonly known as the three Vs, widely accepted as defining characteristics of big data: volume, variety, and velocity. In addition to these a fourth V, the veracity of healthcare data, is obviously critical for its meaningful use.<sup id=\"rdp-ebb-cite_ref-BelleBigData15_8-4\" class=\"reference\"><a href=\"#cite_note-BelleBigData15-8\" rel=\"external_link\">[8]<\/a><\/sup>\n<\/p><p>All possible information sources and data flows within the health information system must be perfectly identified as well. Since the information system must store all data required for the performance of the different corporate functions of the health system, it is clear that all of its components must be interoperable, as stated above, so that any data can be accessed from any point of the health system that needs them. Hence another cardinal requirement is the interoperability of systems, subsystems, and components, defined as their capability of exchanging information without altering the meaning of the exchanged data, regardless of their source and their use within each system. 
\n<\/p><p>For instance, a medical consultation generates information used for the patient\u2019s healthcare, the management of the employed resources, and the billing of the service, but it can also be used in the medium and long term for outcome assessment, strategic planning, research, education, epidemiological surveillance, or even as evidence in legal proceedings. Moreover, the aggregate of all the data generated during that consultation and during millions of similar healthcare events will be useful to create knowledge, on which clinical decision support systems will be based.\n<\/p><p>Therefore the cycle comprises the transition from data to information, from information to knowledge, and from knowledge to practice. All of this requires the interoperability of clinical information systems, logistic and economic-financial systems, business intelligence systems, and university\/R&D center systems, among others. As a consequence, every system must be capable of filtering the information received in order to extract the data it needs, so as to not compromise their processing, thus avoiding the risk of producing adulterated results. \n<\/p><p>Finally, from a technological point of view, it is mandatory to have a high-performance IT infrastructure on which to rely for the generation, storing, processing, and exchange of large data volumes in a quick and efficient way. Luckily, hardware, software, and communications solutions have progressed enormously in recent years, so technological viability is hardly an obstacle nowadays.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Some_additional_issues_and_barriers\">Some additional issues and barriers<\/span><\/h3>\n<p>The implementation of big data solutions and tools in the healthcare field requires addressing not only the organizational and technological issues detailed above, but also several legal and ethical questions. 
\n<\/p><p>From a legal point of view, the first cause of conflict may be data ownership. As explained in previous sections, any data that are properly processed and analyzed can be turned into knowledge, and the latter can be easily made profitable. The first companies working this angle are tech giants such as Google, which provides personalized advertisements based on navigation and search history, and Facebook, which has admitted to focusing part of its efforts on sociological research based on its users\u2019 data, and has even tried to take possession of this information in a completely unilateral way.<sup id=\"rdp-ebb-cite_ref-Mart.C3.ADnezGesti.C3.B3n14_7-1\" class=\"reference\"><a href=\"#cite_note-Mart.C3.ADnezGesti.C3.B3n14-7\" rel=\"external_link\">[7]<\/a><\/sup>\n<\/p><p>Given that the generation, recording, and processing of all this information requires a powerful hardware and software infrastructure, and therefore a large investment by these companies, their intention to make it profitable may be considered legitimate to a certain point, especially if they are not charging users for the service provided. However, limitations regarding the use of the stored data must be clearly established, something that seems to be far from being solved with the current legal framework, which is quite confusing. 
For instance, in the case of Spain, this framework combines European Union, national, regional, and sectoral (both health and e-government) regulations.<sup id=\"rdp-ebb-cite_ref-Mart.C3.ADnezGesti.C3.B3n14_7-2\" class=\"reference\"><a href=\"#cite_note-Mart.C3.ADnezGesti.C3.B3n14-7\" rel=\"external_link\">[7]<\/a><\/sup> Moreover, most of this legislation is outdated to a large extent, since it was passed in a time when IT progress was nowhere near what we have today.<sup id=\"rdp-ebb-cite_ref-CarniceroLaExplot16-3_16-0\" class=\"reference\"><a href=\"#cite_note-CarniceroLaExplot16-3-16\" rel=\"external_link\">[16]<\/a><\/sup>\n<\/p><p>At this point, a revision of these regulations, taking into account the current potential of big data solutions, as well as their foreseeable potential in the short and medium term, seems to be more than appropriate. Of course, this revision must be addressed with the goal of balancing the individual interests of patients (right to privacy) and professionals (legal certainty in the performance of their healthcare and management duties), as well as the general interests of society (research, education, or improvement of healthcare services, among others). To that end, protocols must be defined that combine both <i>a priori<\/i> measures, such as data anonymization, and <i>a posteriori<\/i> measures, such as thorough audits regarding the access and use of data. Keeping the human factor in mind, one of the most crucial <i>a priori<\/i> measures will always be raising the awareness of patients, professionals, and organizations.\n<\/p><p>From an ethical point of view, quite a few similarities to the legal field can be observed. The fact that IT is going to play an increasingly important role in health systems seems to be widely accepted, since its potential as a key instrument for the transformation of the current model is appreciated. 
Nevertheless, there is also great concern about the lack of transparency in the management of the large amounts of data held by healthcare organizations. For this reason, the promotion of more and better control measures is backed by bioethics experts, starting with the development of a specific legal framework that can be turned into clear and visible actions, thus transmitting a sense of security and helping to build trust in healthcare data mining.<sup id=\"rdp-ebb-cite_ref-CarniceroLaExplot16-2_15-1\" class=\"reference\"><a href=\"#cite_note-CarniceroLaExplot16-2-15\" rel=\"external_link\">[15]<\/a><\/sup>\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Big_data_in_SERGAS:_A_case_study\">Big data in SERGAS: A case study<\/span><\/h2>\n<p>Within the Spanish National Health System, healthcare is the responsibility of the Autonomous Communities, which represent the regional level of government, and each has its own health service. In the case of the community of Galicia, this is the Galician Health Service (Servizo Galego de Sa\u00fade, SERGAS).\n<\/p><p>SERGAS relies on a business intelligence (BI) solution for the exploitation of structured datasets, these being provided by a regional database in which information supplied by the different hospitals and primary care centers of this health service is aggregated. In addition, a management system for information related to human resources and pharmaceutical expenditure provides structured data as well.\n<\/p><p>In order to complement this BI system, SERGAS has implemented big data technologies so as to exploit unstructured data stored in the patients\u2019 electronic health records. This innovation makes SERGAS the first Spanish health service to use big data in a systematic way. 
With a total budget of 982,278 euros, several projects have been developed along the following lines of action:\n<\/p><p><b>Rare disease management<\/b>\n<\/p>\n<ul><li> detection of suspected cases<\/li>\n<li> creation of a rare diseases registry<\/li><\/ul>\n<p><b>Chronic disease management<\/b>\n<\/p>\n<ul><li> detection of diabetes mellitus type 2 patients, chronic obstructive pulmonary disease patients, and patients with pluripathology who are not yet categorized as such in their health records<\/li>\n<li> calculation of prevalence and incidence indicators, as well as risk factors<\/li><\/ul>\n<p><b>Clinical research<\/b>\n<\/p>\n<ul><li> decision-making support regarding the selection of the most appropriate kind of vascular endoprostheses (stents)<\/li><\/ul>\n<p><b>Nosocomial infection management<\/b>\n<\/p>\n<ul><li> research and categorization of detected cases<\/li>\n<li> automated alerts<\/li><\/ul>\n<p><b>Surveillance of several syndromes<\/b>\n<\/p>\n<ul><li> case identification<\/li>\n<li> detection of food toxi-infection and acute respiratory symptom outbreaks<\/li><\/ul>\n<p><b>Exploitation of lab test results<\/b> \n<\/p>\n<ul><li> currently in progress<\/li><\/ul>\n<p>As a whole, these systems are handling information belonging to 2.9 million patients, provided by 63 different data sources. As of 2016, 59 million normalized events have been compiled, 12 million documents (of 50 different kinds) have been semantically processed, and 500,000 cases have been detected. \n<\/p><p>Regarding information security, SERGAS applies a set of corporate criteria, with standard measures such as the definition of user profiles and access authorization levels, the anonymization of aggregate data, and the performance of audits to verify regulatory compliance. 
Additionally, there are several committees that define the guidelines for the management of ethics and governance, always within the current legal framework.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Conclusions\">Conclusions<\/span><\/h2>\n<p>As with IT in general, the successful implementation of big data solutions in a healthcare environment will depend on their capability to generate added value that benefits patients, professionals, and organizations. No one seems to doubt the need to improve public health systems by evolving their current model, or the potentially valuable contributions of big data in this respect, but the great complexity of implementing these kinds of tools also seems clear, given the requirements and, in some cases, the obstacles of various kinds that must be dealt with.\n<\/p><p>Now that technological viability has apparently been achieved, it is time for healthcare organizations and authorities to face the challenge of studying the possibilities of big data and seeking the best way of applying it to the solution of their issues, problems, and needs. In order to achieve this, they should not start by asking what information they have now and what they can achieve with it, but instead focus on what information they need and how they can get it. The most frequent problem will not be the availability of the necessary data, but the screening of the relevant information and how to assess it. 
In summary, the most important thing is not having the data, since this is already happening, but being able to ask the right questions at the right moment, process them to provide only the necessary and relevant information, and show the latter to healthcare professionals in such a way that they can assimilate it in a quick, correct, and easy manner in order to make the right decisions at the right time.\n<\/p><p>On the healthcare side, big data must become the foundation of clinical decision-making support systems, and also an instrument for data aggregation concerning public health departments, as well as research and education. On the management side, managers will have access to more accurate and timely knowledge concerning the real status of their organizations and adopt a proactive plan instead of a retrospective one. In addition, they will be capable of detecting deviations from objectives earlier and applying the appropriate corrective and, preferably, preventive measures. \n<\/p><p>In conclusion, the implementation of big data must be one of the main instruments for change in the current health system model, changing it into one with improved effectiveness and efficiency, taking into account both healthcare and economic outcomes of health services, thus being meaningful to patients and also to society, all while taking advantage of patients\u2019 potential as active participants in their own care.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Acknowledgements\">Acknowledgements<\/span><\/h2>\n<p>The authors wish to thank Ms. Pilar Carnicero and Mr. 
Guillermo V\u00e1zquez for their contributions in elaborating on and improving this article.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-WHOEvery07-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WHOEvery07_1-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">World Health Organization (2007). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.who.int\/iris\/handle\/10665\/43918\" target=\"_blank\"><i>Everybody's business -- Strengthening health systems to improve health outcomes: WHO's framework for action<\/i><\/a>. World Health Organization. pp. 44. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9789241596077<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.who.int\/iris\/handle\/10665\/43918\" target=\"_blank\">http:\/\/www.who.int\/iris\/handle\/10665\/43918<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=book&rft.btitle=Everybody%27s+business+--+Strengthening+health+systems+to+improve+health+outcomes%3A+WHO%27s+framework+for+action&rft.aulast=World+Health+Organization&rft.au=World+Health+Organization&rft.date=2007&rft.pages=pp.%26nbsp%3B44&rft.pub=World+Health+Organization&rft.isbn=9789241596077&rft_id=http%3A%2F%2Fwww.who.int%2Firis%2Fhandle%2F10665%2F43918&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_and_public_health_systems:_Issues_and_opportunities\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WHOTallin08-2\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WHOTallin08_2-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.euro.who.int\/en\/publications\/policy-documents\/tallinn-charter-health-systems-for-health-and-wealth\" target=\"_blank\">\"The Tallinn Charter: Health Systems for Health and Wealth\"<\/a>. World Health Organization. 27 June 2008<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.euro.who.int\/en\/publications\/policy-documents\/tallinn-charter-health-systems-for-health-and-wealth\" target=\"_blank\">http:\/\/www.euro.who.int\/en\/publications\/policy-documents\/tallinn-charter-health-systems-for-health-and-wealth<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=The+Tallinn+Charter%3A+Health+Systems+for+Health+and+Wealth&rft.atitle=&rft.date=27+June+2008&rft.pub=World+Health+Organization&rft_id=http%3A%2F%2Fwww.euro.who.int%2Fen%2Fpublications%2Fpolicy-documents%2Ftallinn-charter-health-systems-for-health-and-wealth&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_and_public_health_systems:_Issues_and_opportunities\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-RojasAModel15-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-RojasAModel15_3-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Rojas, D.; Carnicero, J. (2015). <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.novapublishers.com\/catalog\/product_info.php?products_id=52835\" target=\"_blank\">\"A Model Of Information System For Healthcare: Global Vision and Integrated Data Flows\"<\/a>. In Berhardt, L.V.. <i>Advances in Medicine and Biology<\/i>. <b>82<\/b>. Nova Science Publishers. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9781634636339<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.novapublishers.com\/catalog\/product_info.php?products_id=52835\" target=\"_blank\">https:\/\/www.novapublishers.com\/catalog\/product_info.php?products_id=52835<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=A+Model+Of+Information+System+For+Healthcare%3A+Global+Vision+and+Integrated+Data+Flows&rft.atitle=Advances+in+Medicine+and+Biology&rft.aulast=Rojas%2C+D.%3B+Carnicero%2C+J.&rft.au=Rojas%2C+D.%3B+Carnicero%2C+J.&rft.date=2015&rft.volume=82&rft.pub=Nova+Science+Publishers&rft.isbn=9781634636339&rft_id=https%3A%2F%2Fwww.novapublishers.com%2Fcatalog%2Fproduct_info.php%3Fproducts_id%3D52835&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_and_public_health_systems:_Issues_and_opportunities\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CarniceroLaExplot16-4\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CarniceroLaExplot16_4-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Carnicero, J.; Rojas, D.; Gonz\u00e1lez, A. et al. (2016) (PDF). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.seis.es\/documentos\/Informe%20La%20explotacion%20de%20datos%20de%20Salud\/LA%20EXPLOTACI%C3%93N%20DE%20DATOS%20DE%20SALUD.pdf\" target=\"_blank\"><i>La explotaci\u00f3n de datos de salud: Retos, oportunidades y l\u00edmites<\/i><\/a>. Sociedad Espa\u00f1ola de Inform\u00e1tica de la Salud. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9788460889472<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.seis.es\/documentos\/Informe%20La%20explotacion%20de%20datos%20de%20Salud\/LA%20EXPLOTACI%C3%93N%20DE%20DATOS%20DE%20SALUD.pdf\" target=\"_blank\">http:\/\/www.seis.es\/documentos\/Informe%20La%20explotacion%20de%20datos%20de%20Salud\/LA%20EXPLOTACI%C3%93N%20DE%20DATOS%20DE%20SALUD.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=book&rft.btitle=La+explotaci%C3%B3n+de+datos+de+salud%3A+Retos%2C+oportunidades+y+l%C3%ADmites&rft.aulast=Carnicero%2C+J.%3B+Rojas%2C+D.%3B+Gonz%C3%A1lez%2C+A.+et+al.&rft.au=Carnicero%2C+J.%3B+Rojas%2C+D.%3B+Gonz%C3%A1lez%2C+A.+et+al.&rft.date=2016&rft.pub=Sociedad+Espa%C3%B1ola+de+Inform%C3%A1tica+de+la+Salud&rft.isbn=9788460889472&rft_id=http%3A%2F%2Fwww.seis.es%2Fdocumentos%2FInforme%2520La%2520explotacion%2520de%2520datos%2520de%2520Salud%2FLA%2520EXPLOTACI%25C3%2593N%2520DE%2520DATOS%2520DE%2520SALUD.pdf&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_and_public_health_systems:_Issues_and_opportunities\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HansenBigData14-5\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-HansenBigData14_5-0\" rel=\"external_link\">5.0<\/a><\/sup> <sup><a href=\"#cite_ref-HansenBigData14_5-1\" rel=\"external_link\">5.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Hansen, M.M.; Miron-Shatz, T.; Lau, A.Y.; Paton, C. (2014). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4287084\" target=\"_blank\">\"Big Data in Science and Healthcare: A Review of Recent Literature and Perspectives - Contribution of the IMIA Social Media Working Group\"<\/a>. <i>Yearbook of Medical Informatics<\/i> <b>9<\/b>: 21\u20136. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.15265%2FIY-2014-0004\" target=\"_blank\">10.15265\/IY-2014-0004<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4287084\/\" target=\"_blank\">PMC4287084<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25123717\" target=\"_blank\">25123717<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4287084\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4287084<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Big+Data+in+Science+and+Healthcare%3A+A+Review+of+Recent+Literature+and+Perspectives+-+Contribution+of+the+IMIA+Social+Media+Working+Group&rft.jtitle=Yearbook+of+Medical+Informatics&rft.aulast=Hansen%2C+M.M.%3B+Miron-Shatz%2C+T.%3B+Lau%2C+A.Y.%3B+Paton%2C+C.&rft.au=Hansen%2C+M.M.%3B+Miron-Shatz%2C+T.%3B+Lau%2C+A.Y.%3B+Paton%2C+C.&rft.date=2014&rft.volume=9&rft.pages=21%E2%80%936&rft_id=info:doi\/10.15265%2FIY-2014-0004&rft_id=info:pmc\/PMC4287084&rft_id=info:pmid\/25123717&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4287084&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_and_public_health_systems:_Issues_and_opportunities\"><span style=\"display: 
none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PorterTheStrat13-6\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PorterTheStrat13_6-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Porter, M.E.; Lee, T.H. (2013). <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/hbr.org\/2013\/10\/the-strategy-that-will-fix-health-care\" target=\"_blank\">\"The Strategy That Will Fix Health Care\"<\/a>. <i>Harvard Business Review<\/i> <b>10<\/b>: 50\u201370<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/hbr.org\/2013\/10\/the-strategy-that-will-fix-health-care\" target=\"_blank\">https:\/\/hbr.org\/2013\/10\/the-strategy-that-will-fix-health-care<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+Strategy+That+Will+Fix+Health+Care&rft.jtitle=Harvard+Business+Review&rft.aulast=Porter%2C+M.E.%3B+Lee%2C+T.H.&rft.au=Porter%2C+M.E.%3B+Lee%2C+T.H.&rft.date=2013&rft.volume=10&rft.pages=50%E2%80%9370&rft_id=https%3A%2F%2Fhbr.org%2F2013%2F10%2Fthe-strategy-that-will-fix-health-care&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_and_public_health_systems:_Issues_and_opportunities\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Mart.C3.ADnezGesti.C3.B3n14-7\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-Mart.C3.ADnezGesti.C3.B3n14_7-0\" rel=\"external_link\">7.0<\/a><\/sup> <sup><a href=\"#cite_ref-Mart.C3.ADnezGesti.C3.B3n14_7-1\" rel=\"external_link\">7.1<\/a><\/sup> <sup><a href=\"#cite_ref-Mart.C3.ADnezGesti.C3.B3n14_7-2\" rel=\"external_link\">7.2<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation book\">Mart\u00ednez, R., Rojas, D. (2014). 
<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/repositorio.cepal.org\/handle\/11362\/37058\" target=\"_blank\">\"Gesti\u00f3n de la seguridad de la informaci\u00f3n en atenci\u00f3n primaria y uso responsable de Internet y de las redes sociales\"<\/a>. In Carnicero, J.; Fern\u00e1ndez, A.; Rojas de la Escalera, D.. <i>Manual de salud electr\u00f3nica para directivos de servicios y sistemas de salud<\/i>. <b>2<\/b>. United Nations<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/repositorio.cepal.org\/handle\/11362\/37058\" target=\"_blank\">https:\/\/repositorio.cepal.org\/handle\/11362\/37058<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Gesti%C3%B3n+de+la+seguridad+de+la+informaci%C3%B3n+en+atenci%C3%B3n+primaria+y+uso+responsable+de+Internet+y+de+las+redes+sociales&rft.atitle=Manual+de+salud+electr%C3%B3nica+para+directivos+de+servicios+y+sistemas+de+salud&rft.aulast=Mart%C3%ADnez%2C+R.%2C+Rojas%2C+D.&rft.au=Mart%C3%ADnez%2C+R.%2C+Rojas%2C+D.&rft.date=2014&rft.volume=2&rft.pub=United+Nations&rft_id=https%3A%2F%2Frepositorio.cepal.org%2Fhandle%2F11362%2F37058&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_and_public_health_systems:_Issues_and_opportunities\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BelleBigData15-8\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-BelleBigData15_8-0\" rel=\"external_link\">8.0<\/a><\/sup> <sup><a href=\"#cite_ref-BelleBigData15_8-1\" rel=\"external_link\">8.1<\/a><\/sup> <sup><a href=\"#cite_ref-BelleBigData15_8-2\" rel=\"external_link\">8.2<\/a><\/sup> <sup><a href=\"#cite_ref-BelleBigData15_8-3\" rel=\"external_link\">8.3<\/a><\/sup> <sup><a href=\"#cite_ref-BelleBigData15_8-4\" rel=\"external_link\">8.4<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Belle, A.; 
Thiagarajan, R.; Soroushmehr, S.M.R. et al. (2015). \"Big data analytics in healthcare\". <i>BioMed Research International<\/i> <b>2015<\/b> (2015): 370194. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1155%2F2015%2F370194\" target=\"_blank\">10.1155\/2015\/370194<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Big+data+analytics+in+healthcare&rft.jtitle=BioMed+Research+International&rft.aulast=Belle%2C+A.%3B+Thiagarajan%2C+R.%3B+Soroushmehr%2C+S.M.R.+et+al.&rft.au=Belle%2C+A.%3B+Thiagarajan%2C+R.%3B+Soroushmehr%2C+S.M.R.+et+al.&rft.date=2015&rft.volume=2015&rft.issue=2015&rft.pages=370194&rft_id=info:doi\/10.1155%2F2015%2F370194&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_and_public_health_systems:_Issues_and_opportunities\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LuoBigData16-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LuoBigData16_9-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Luo, J.; Wu, M.; Gopukumar, D.; Zhao, Y. (2016). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4720168\" target=\"_blank\">\"Big Data Application in Biomedical Research and Health Care: A Literature Review\"<\/a>. <i>Biomedical Informatics Insights<\/i> <b>8<\/b>: 1\u201310. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4137%2FBII.S31559\" target=\"_blank\">10.4137\/BII.S31559<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4720168\/\" target=\"_blank\">PMC4720168<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26843812\" target=\"_blank\">26843812<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4720168\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4720168<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Big+Data+Application+in+Biomedical+Research+and+Health+Care%3A+A+Literature+Review&rft.jtitle=Biomedical+Informatics+Insights&rft.aulast=Luo%2C+J.%3B+Wu%2C+M.%3B+Gopukumar%2C+D.%3B+Zhao%2C+Y.&rft.au=Luo%2C+J.%3B+Wu%2C+M.%3B+Gopukumar%2C+D.%3B+Zhao%2C+Y.&rft.date=2016&rft.volume=8&rft.pages=1%E2%80%9310&rft_id=info:doi\/10.4137%2FBII.S31559&rft_id=info:pmc\/PMC4720168&rft_id=info:pmid\/26843812&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4720168&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_and_public_health_systems:_Issues_and_opportunities\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NASEMBigData16-10\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-NASEMBigData16_10-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">National Academies of Sciences, Engineering, and Medicine (2016). 
<i>Big Data and Analytics for Infectious Disease Research, Operations, and Policy: Proceedings of a Workshop<\/i>. The National Academies Press. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.17226%2F23654\" target=\"_blank\">10.17226\/23654<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9780309450140.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=book&rft.btitle=Big+Data+and+Analytics+for+Infectious+Disease+Research%2C+Operations%2C+and+Policy%3A+Proceedings+of+a+Workshop&rft.aulast=National+Academies+of+Sciences%2C+Engineering%2C+and+Medicine&rft.au=National+Academies+of+Sciences%2C+Engineering%2C+and+Medicine&rft.date=2016&rft.pub=The+National+Academies+Press&rft_id=info:doi\/10.17226%2F23654&rft.isbn=9780309450140&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_and_public_health_systems:_Issues_and_opportunities\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PanahiazarEmpower14-11\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-PanahiazarEmpower14_11-0\" rel=\"external_link\">11.0<\/a><\/sup> <sup><a href=\"#cite_ref-PanahiazarEmpower14_11-1\" rel=\"external_link\">11.1<\/a><\/sup> <sup><a href=\"#cite_ref-PanahiazarEmpower14_11-2\" rel=\"external_link\">11.2<\/a><\/sup> <sup><a href=\"#cite_ref-PanahiazarEmpower14_11-3\" rel=\"external_link\">11.3<\/a><\/sup> <sup><a href=\"#cite_ref-PanahiazarEmpower14_11-4\" rel=\"external_link\">11.4<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Panahiazar, M.; Taslimitehrani, V.; Jadhav, A.; Pathak, J. (2014). 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4333680\" target=\"_blank\">\"Empowering Personalized Medicine with Big Data and Semantic Web Technology: Promises, Challenges, and Use Cases\"<\/a>. <i>Proceedings of the IEEE International Conference on Big Data<\/i> <b>2014<\/b>: 790\u20135. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FBigData.2014.7004307\" target=\"_blank\">10.1109\/BigData.2014.7004307<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4333680\/\" target=\"_blank\">PMC4333680<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25705726\" target=\"_blank\">25705726<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4333680\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4333680<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Empowering+Personalized+Medicine+with+Big+Data+and+Semantic+Web+Technology%3A+Promises%2C+Challenges%2C+and+Use+Cases&rft.jtitle=Proceedings+of+the+IEEE+International+Conference+on+Big+Data&rft.aulast=Panahiazar%2C+M.%3B+Taslimitehrani%2C+V.%3B+Jadhav%2C+A.%3B+Pathak%2C+J.&rft.au=Panahiazar%2C+M.%3B+Taslimitehrani%2C+V.%3B+Jadhav%2C+A.%3B+Pathak%2C+J.&rft.date=2014&rft.volume=2014&rft.pages=790%E2%80%935&rft_id=info:doi\/10.1109%2FBigData.2014.7004307&rft_id=info:pmc\/PMC4333680&rft_id=info:pmid\/25705726&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4333680&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_and_public_health_systems:_Issues_and_opportunities\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KuziemskyBigData14-12\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-KuziemskyBigData14_12-0\" rel=\"external_link\">12.0<\/a><\/sup> <sup><a href=\"#cite_ref-KuziemskyBigData14_12-1\" rel=\"external_link\">12.1<\/a><\/sup> <sup><a href=\"#cite_ref-KuziemskyBigData14_12-2\" rel=\"external_link\">12.2<\/a><\/sup> <sup><a href=\"#cite_ref-KuziemskyBigData14_12-3\" rel=\"external_link\">12.3<\/a><\/sup> <sup><a href=\"#cite_ref-KuziemskyBigData14_12-4\" rel=\"external_link\">12.4<\/a><\/sup> <sup><a href=\"#cite_ref-KuziemskyBigData14_12-5\" rel=\"external_link\">12.5<\/a><\/sup> <sup><a href=\"#cite_ref-KuziemskyBigData14_12-6\" rel=\"external_link\">12.6<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Kuziemsky, C.E.; 
Monkman, H.; Petersen, C. et al. (2014). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4287094\" target=\"_blank\">\"Big Data in Healthcare - Defining the Digital Persona through User Contexts from the Micro to the Macro. Contribution of the IMIA Organizational and Social Issues WG\"<\/a>. <i>Yearbook of Medical Informatics<\/i> <b>2014<\/b>: 82\u20139. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.15265%2FIY-2014-0014\" target=\"_blank\">10.15265\/IY-2014-0014<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4287094\/\" target=\"_blank\">PMC4287094<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25123726\" target=\"_blank\">25123726<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4287094\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4287094<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Big+Data+in+Healthcare+-+Defining+the+Digital+Persona+through+User+Contexts+from+the+Micro+to+the+Macro.+Contribution+of+the+IMIA+Organizational+and+Social+Issues+WG&rft.jtitle=Yearbook+of+Medical+Informatics&rft.aulast=Kuziemsky%2C+C.E.%3B+Monkman%2C+H.%3B+Petersen%2C+C.+et+al.&rft.au=Kuziemsky%2C+C.E.%3B+Monkman%2C+H.%3B+Petersen%2C+C.+et+al.&rft.date=2014&rft.volume=2014&rft.pages=82%E2%80%939&rft_id=info:doi\/10.15265%2FIY-2014-0014&rft_id=info:pmc\/PMC4287094&rft_id=info:pmid\/25123726&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4287094&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_and_public_health_systems:_Issues_and_opportunities\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ManyikaBigData11-13\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ManyikaBigData11_13-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Manyika, J.; Chui, M.; Brown, B. et al. (May 2011). <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.mckinsey.com\/business-functions\/digital-mckinsey\/our-insights\/big-data-the-next-frontier-for-innovation\" target=\"_blank\">\"Big data: The next frontier for innovation, competition, and productivity\"<\/a>. McKinsey Global Institute<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.mckinsey.com\/business-functions\/digital-mckinsey\/our-insights\/big-data-the-next-frontier-for-innovation\" target=\"_blank\">https:\/\/www.mckinsey.com\/business-functions\/digital-mckinsey\/our-insights\/big-data-the-next-frontier-for-innovation<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Big+data%3A+The+next+frontier+for+innovation%2C+competition%2C+and+productivity&rft.atitle=&rft.aulast=Manyika%2C+J.%3B+Chui%2C+M.%3B+Brown%2C+B.+et+al.&rft.au=Manyika%2C+J.%3B+Chui%2C+M.%3B+Brown%2C+B.+et+al.&rft.date=May+2011&rft.pub=McKinsey+Global+Institute&rft_id=https%3A%2F%2Fwww.mckinsey.com%2Fbusiness-functions%2Fdigital-mckinsey%2Four-insights%2Fbig-data-the-next-frontier-for-innovation&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_and_public_health_systems:_Issues_and_opportunities\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CarniceroLasTIC16-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CarniceroLasTIC16_14-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Carnicero, J.; Rojas, D.; Mart\u00ednez, R. et al. (2016) (PDF). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.seis.es\/documentos\/XI%20InformeSeis.pdf\" target=\"_blank\"><i>XI Informe SEIS - Las TIC y la seguridad de los pacientes: Primum non nocere<\/i><\/a>. Sociedad Espa\u00f1ola de Inform\u00e1tica de la Salud. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9788461740246<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.seis.es\/documentos\/XI%20InformeSeis.pdf\" target=\"_blank\">http:\/\/www.seis.es\/documentos\/XI%20InformeSeis.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=book&rft.btitle=XI+Informe+SEIS+-+Las+TIC+y+la+seguridad+de+los+pacientes%3A+Primum+non+nocere&rft.aulast=Carnicero%2C+J.%3B+Rojas%2C+D.%3B+Mart%C3%ADnez%2C+R.+et+al.&rft.au=Carnicero%2C+J.%3B+Rojas%2C+D.%3B+Mart%C3%ADnez%2C+R.+et+al.&rft.date=2016&rft.pub=Sociedad+Espa%C3%B1ola+de+Inform%C3%A1tica+de+la+Salud&rft.isbn=9788461740246&rft_id=http%3A%2F%2Fwww.seis.es%2Fdocumentos%2FXI%2520InformeSeis.pdf&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_and_public_health_systems:_Issues_and_opportunities\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CarniceroLaExplot16-2-15\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-CarniceroLaExplot16-2_15-0\" rel=\"external_link\">15.0<\/a><\/sup> <sup><a href=\"#cite_ref-CarniceroLaExplot16-2_15-1\" rel=\"external_link\">15.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation book\">Le\u00f3n, P. (2016). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.seis.es\/documentos\/Informe%20La%20explotacion%20de%20datos%20de%20Salud\/LA%20EXPLOTACI%C3%93N%20DE%20DATOS%20DE%20SALUD.pdf\" target=\"_blank\">\"Cap\u00edtulo III: Bio\u00e9tica y explotaci\u00f3n de grandes conjuntos de datos\"<\/a>. In Carnicero, J.; Rojas, D.; Gonz\u00e1lez, A. et al. (PDF). <i>La explotaci\u00f3n de datos de salud: Retos, oportunidades y l\u00edmites<\/i>. Sociedad Espa\u00f1ola de Inform\u00e1tica de la Salud. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9788460889472<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.seis.es\/documentos\/Informe%20La%20explotacion%20de%20datos%20de%20Salud\/LA%20EXPLOTACI%C3%93N%20DE%20DATOS%20DE%20SALUD.pdf\" target=\"_blank\">http:\/\/www.seis.es\/documentos\/Informe%20La%20explotacion%20de%20datos%20de%20Salud\/LA%20EXPLOTACI%C3%93N%20DE%20DATOS%20DE%20SALUD.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Cap%C3%ADtulo+III%3A+Bio%C3%A9tica+y+explotaci%C3%B3n+de+grandes+conjuntos+de+datos&rft.atitle=La+explotaci%C3%B3n+de+datos+de+salud%3A+Retos%2C+oportunidades+y+l%C3%ADmites&rft.aulast=Le%C3%B3n%2C+P.&rft.au=Le%C3%B3n%2C+P.&rft.date=2016&rft.pub=Sociedad+Espa%C3%B1ola+de+Inform%C3%A1tica+de+la+Salud&rft.isbn=9788460889472&rft_id=http%3A%2F%2Fwww.seis.es%2Fdocumentos%2FInforme%2520La%2520explotacion%2520de%2520datos%2520de%2520Salud%2FLA%2520EXPLOTACI%25C3%2593N%2520DE%2520DATOS%2520DE%2520SALUD.pdf&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_and_public_health_systems:_Issues_and_opportunities\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CarniceroLaExplot16-3-16\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CarniceroLaExplot16-3_16-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">And\u00e9rez, A. (2016). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.seis.es\/documentos\/Informe%20La%20explotacion%20de%20datos%20de%20Salud\/LA%20EXPLOTACI%C3%93N%20DE%20DATOS%20DE%20SALUD.pdf\" target=\"_blank\">\"Cap\u00edtulo IV: Disposiciones legales aplicables\"<\/a>. In Carnicero, J.; Rojas, D.; Gonz\u00e1lez, A. et al. (PDF). <i>La explotaci\u00f3n de datos de salud: Retos, oportunidades y l\u00edmites<\/i>. Sociedad Espa\u00f1ola de Inform\u00e1tica de la Salud. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9788460889472<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.seis.es\/documentos\/Informe%20La%20explotacion%20de%20datos%20de%20Salud\/LA%20EXPLOTACI%C3%93N%20DE%20DATOS%20DE%20SALUD.pdf\" target=\"_blank\">http:\/\/www.seis.es\/documentos\/Informe%20La%20explotacion%20de%20datos%20de%20Salud\/LA%20EXPLOTACI%C3%93N%20DE%20DATOS%20DE%20SALUD.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Cap%C3%ADtulo+IV%3A+Disposiciones+legales+aplicables&rft.atitle=La+explotaci%C3%B3n+de+datos+de+salud%3A+Retos%2C+oportunidades+y+l%C3%ADmites&rft.aulast=And%C3%A9rez%2C+A.&rft.au=And%C3%A9rez%2C+A.&rft.date=2016&rft.pub=Sociedad+Espa%C3%B1ola+de+Inform%C3%A1tica+de+la+Salud&rft.isbn=9788460889472&rft_id=http%3A%2F%2Fwww.seis.es%2Fdocumentos%2FInforme%2520La%2520explotacion%2520de%2520datos%2520de%2520Salud%2FLA%2520EXPLOTACI%25C3%2593N%2520DE%2520DATOS%2520DE%2520SALUD.pdf&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_and_public_health_systems:_Issues_and_opportunities\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to grammar, spelling, and presentation, including the addition of PMCID and DOI when they were missing from the original reference. Citation three is listed in the references of the original but inadvertently omitted from the inline citations; it has been placed in the text at what is believed to be the appropriate citation point. 
Additionally, the original document placed citations 11 and 12 before nine and 10; this version shows citations in order of appearance, by design.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Big_data_and_public_health_systems:_Issues_and_opportunities\">https:\/\/www.limswiki.org\/index.php\/Journal:Big_data_and_public_health_systems:_Issues_and_opportunities<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div 
class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","b94dc07071fd3149fbecd75f93d73558_images":[],"b94dc07071fd3149fbecd75f93d73558_timestamp":1544813858,"75f1e35ff0bfbfc1a106a26c2f646394_type":"article","75f1e35ff0bfbfc1a106a26c2f646394_title":"Generating big data sets from knowledge-based decision support systems to pursue value-based healthcare (Gonz\u00e1lez-Ferrer et al. 2018)","75f1e35ff0bfbfc1a106a26c2f646394_url":"https:\/\/www.limswiki.org\/index.php\/Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare","75f1e35ff0bfbfc1a106a26c2f646394_plaintext":"\n\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\n\t\t\t\tJournal:Generating big data sets from knowledge-based decision support systems to pursue value-based healthcare\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tFrom LIMSWiki\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\tJump to: navigation, search\n\n\t\t\t\t\t\n\t\t\t\t\tFull article title\n \nGenerating big data sets from knowledge-based decision support systems to pursue value-based healthcareJournal\n \nInternational Journal of Interactive Multimedia and Artificial IntelligenceAuthor(s)\n \nGonz\u00e1lez-Ferrer, Arturo; Seara, Germ\u00e1n; Ch\u00e1fer, Joan; Mayol, JulioAuthor affiliation(s)\n \nInstituto de Investigaci\u00f3n Sanitaria San Carlos, Hospital Universitario Cl\u00ednico San CarlosPrimary contact\n \nEmail: arturogf at gmail dot comYear published\n \n2018Volume and issue\n \n4 (7)Page(s)\n \n42\u201346DOI\n \n10.9781\/ijimai.2017.03.006ISSN\n \n1989-1660Distribution license\n \nCreative Commons Attribution 3.0 UnportedWebsite\n \nhttp:\/\/www.ijimai.org\/journal\/node\/1626Download\n \nhttp:\/\/www.ijimai.org\/journal\/sites\/default\/files\/files\/2017\/03\/ijimai_4_7_6_pdf_12585.pdf (PDF)\n\nContents\n\n1 Abstract \n2 Introduction \n3 Knowledge-based decision support systems (KB-DSS) \n4 Innovative projects in HUCSC \n\n4.1 Computer-interpretable guideline for diagnosis and treatment of 
hyponatremia \n4.2 Unsupervised learning of discharge data (big data) \n4.3 Hikari: A case study of mental health (big data) \n4.4 Clinical data repository for secondary use \n\n\n5 Discussion \n6 Conclusion \n7 References \n8 Notes \n\n\n\nAbstract \nWhen talking about big data in healthcare, we usually refer to how to use data collected from current electronic medical records, either structured or unstructured, to answer clinically relevant questions. This operation is typically carried out by means of analytics tools (e.g., machine learning) or by extracting relevant data from patient summaries through natural language processing techniques. From other perspectives of research in medical informatics, powerful initiatives have emerged to help physicians make decisions, in both diagnostics and therapeutics, built from existing medical evidence (i.e., knowledge-based decision support systems). Many of the problems these tools have shown when used in real clinical settings are related to their implementation and deployment rather than to failures in the support they provide; however, technology is slowly overcoming interoperability and integration issues. Beyond the point-of-care decision support these tools can provide, the data generated when using them, even in controlled trials, could be used to further analyze facts that are traditionally ignored in current clinical practice. In this paper, we reflect on the technologies available to make the leap and how they could help drive healthcare organizations to shift to a value-based healthcare philosophy.\nKeywords: big data, DSS, e-health, knowledge management, management systems\n\nIntroduction \nHealthcare took a big step towards modernization with the emergence of the evidence-based medicine (EBM) concept in the late 1980s.[1] EBM is an approach to medical practice that aims to apply the best known scientific evidence to clinical decision-making regarding diagnosis and effective management of specific conditions and diseases. 
While the EBM concept was generally well received by care professionals, many factors, such as their daily work conditions or their high workload, hinder putting this approach into practice in the expected way. A 2012 report from the Institute of Medicine revealed that only 10 to 20 percent of the decisions clinicians make are evidence-based.[2] This fact reflects the need for medical practitioners, supported by their healthcare organizations, to shift the way clinical practice is currently carried out.\nThe idea of EBM emerged in very different conditions from the current scenario. Over nearly thirty years, an explosion of technical possibilities has come into place to help organizations take a more modern approach, providing them with support in this regard. Not only epidemiological research can drive EBM, but also new data-oriented approaches. By \u201cdata-oriented,\u201d we refer to data about real daily clinical practice: how, when, why, and by whom clinical actions are carried out (or not), and what the health results of those actions are. Nonetheless, this might still be hampered by the current design of electronic medical records (EMRs) and by the role and focus that contemporary doctors should adopt. The use of EMRs by physicians could be insufficient, as recognized by studies[2] showing that, even after the digitalization of healthcare, they are not utilized to their maximum potential.\nBecause the EBM approach was crafted with the goal of pursuing effectiveness in disease management, it left behind the consideration of organizational and human factors that are crucial in how decisions are truly made. By analyzing data generated by healthcare organizations, we could obtain information concerning the pitfalls that are hindering evidence-based clinical actions. 
At the same time, new evidence could be unveiled that is probably not considered in the current production of clinical practice guidelines (CPGs). For example, Toussi et al.[3] used data mining techniques to find out how physicians prescribe medications in diverse cases with various clinical conditions, in order to complement existing clinical guidelines where sufficient evidence is lacking. Furthermore, specific training actions could be directed to address common failures detected in the management of medical conditions.\nTherefore, the problem that healthcare organizations are trying to solve, under the hypothesis that the \u201cbig data\u201d paradigm will change the way clinical practice is currently carried out, is how they can produce data that help to unveil real clinical behavior and \"mindlines,\"[4] linked with other organizational data (e.g., costs) and context information that could be behind their actions and decisions. Only by making this analysis possible will they be able to change their philosophy to pursue and underpin value, beyond so-called effectiveness. Value here means detecting which actions (later possibly abstracted into policies) could really improve the behavior of organizations and care professionals for the better care of their patients. \nIn this paper, we intend to reflect on some existing techniques beyond current electronic medical records (EMRs) that can help to generate such data sets, as well as considerations to be made, providing some examples of initiatives we are trying to push forward from the Innovation Unit of Hospital Universitario Cl\u00ednico San Carlos (HUCSC).\n\nKnowledge-based decision support systems (KB-DSS) \nIn 2007, Gartner[5] reported a five-stage evolution model for electronic health records (EHRs) where they established a path of characteristics, in terms of eight core capabilities, that EHRs should follow in order to provide the proper support to care professionals. 
Systems complying with Generation 3 requisites were supposed to be able to bring evidence-based medicine to the point of care, and theoretically coincide with the capabilities of most available EHRs, which progressed mainly through the core capabilities of system management, interoperability, and clinical data models, though there is still room for improvement today. Generation 4 was expected to improve the core capabilities of decision support, clinically relevant data analysis, presentation, and clinical workflow management.\nMore recently, Greenes offered his view about the past and future of knowledge-driven health IT[6], stating that current EHR systems were built for a model that is now old and even inappropriate, supported by proprietary infrastructures and knowledge content. He also mentioned the gradual increase in knowledge-based applications during the 2000s, with the creation of computer-interpretable clinical guideline formalisms like GLIF[7] and others.[8] At that time, these systems had little penetration into real clinical settings, mostly due to the lack of pervasiveness of standards and the use of proprietary tools. Fortunately, this is widely recognized by the current health IT community, and steps have been taken to tackle these problems. These steps range from requirements analysis of data standards[9] and the development of data integration mechanisms[10] for making DSS interoperable, through the emergence of new lightweight web services standards like the HL7 Fast Healthcare Interoperability Resources (FHIR)[11], to substantial investments from public bodies that ended up with real deployments and piloting of patient guidance systems. A good example is MobiGuide[10][12], a project funded by the European Commission under the Seventh Framework Programme (FP7). 
Its goal was to create an intelligent KB-DSS to help physicians and patients make the most appropriate decisions to manage specific conditions (atrial fibrillation, gestational diabetes) using a back-end server and wearable sensors to monitor patients' status.\nIn this context, Figure 1 shows the architecture that reflects our view, well aligned with positions already expressed by some research communities.[13] From top to bottom and left to right, physicians and epidemiologists develop CPGs that can be computerized, together with knowledge engineers, into computer-interpretable guideline (CIG) models. With the proper validation mechanisms, using data previously aggregated into clinical data repositories, these models can be trialed after the corresponding integration into hospital information systems. The execution of CIG models can start generating data sets composed of physicians' acceptances or rejections of the recommendations (e.g., diagnosis, drug prescriptions, therapies, etc.) provided by the knowledge-based DSS developed, and of the treatment paths followed for different patient profiles. These paths can later be analyzed by means of process mining techniques[14][15], unveiling common practices followed while using decision support and comparing the compliance of traditional clinical practice with the one recommended by the evidence-based DSS. At the same time, normalized clinical data repositories, while ensuring the quality of the data stored, can be used in the traditional view of machine learning and big data research.[16][17] The results could be complemented by comparing them with the output data sets of the KB-DSS. 
The output of the research could provide new evidence to be included in new versions of the CPGs (continuous improvement).\n\nFigure 1: Architecture for the generation of evidence-based big data sets\n\nInnovative projects in HUCSC \nThe Innovation Unit of Hospital Universitario Cl\u00ednico San Carlos, being transversal to the healthcare institution, is intended to cover two main aspects of innovation, always seeking to increase value. On the one hand, it is expected to help hospital professionals to get their research into the mainstream, when there is an opportunity for it. On the other hand, it maintains a technical department to develop innovative products and test their prototypes, driving the hospital to maximize the possibilities that technological solutions could provide, especially artificial intelligence-based tools. \nThe ultimate intention is to disseminate the existence of these techniques while facilitating their understanding, create a culture of innovation within the hospital, and, when possible, get external companies to finalize these prototypes (or collaborate in their development) if they prove relevant and close to market readiness. The following are several ongoing projects that are aligned with these goals and that contribute several methods and artifacts to the architecture presented.\n\nComputer-interpretable guideline for diagnosis and treatment of hyponatremia \nThe endocrinology department requested a process-based solution to help new residents improve their ability to diagnose and manage hyponatremia (low levels of serum sodium). 
Hyponatremia is the most frequent electrolyte disorder; however, according to some studies, it has proved very difficult for physicians in general to comprehend.[18] To address this need, we developed a CIG model[19][20] using the PROForma set of tools[21][22], covering the diagnosis of hyponatremia and classifying it into thirteen different subtypes. During a retrospective validation of the system with the data from 65 patients, we compared the system\u2019s output to the diagnosis consensus of two experts, obtaining a very high agreement (kappa=0.86). The agreement found was also higher than that of a previous experiment found in the literature[23], carried out by comparing the performance of a resident physician using the original paper guideline with the diagnosis of senior physicians. Nonetheless, the most relevant advance of using such a system, beyond its successful diagnosis performance, was the identification and recording of data cases that were contrary to the consensus of international hyponatremia experts, specifically regarding hypoaldosteronism, where concrete marker thresholds were thought to be associated with its diagnosis. The application of our model found several cases where this hypothesis did not apply, showing the lack of real evidence and the need for further research. This is a concrete demonstration of how putting these knowledge-based systems into practice can help detect where evidence is failing and focus new research directions.\n\nUnsupervised learning of discharge data (big data) \nThe syndrome of inappropriate antidiuretic hormone secretion (SIADH) represents around one-third of all cases of hyponatremia. 
We carried out a project[24] to identify clusters of hospitalized SIADH patients sharing diagnosed pathologies (comorbidities), where the results coincided with and extended previous research identifying individual comorbidities.\nOur methods included testing of two different distance measures and hierarchical agglomerative clustering. We used similarity profile analysis for determination of the number of significant clusters and membership of individuals[25] (by means of the SIMPROF method included in the clustsig R package). The method also provides the members of each proposed cluster, and the validity of the clusters produced is assessed by iteratively carrying out hundreds of permutation tests. Applied to the data from around 650 patients, it unveiled eight clusters, with five of them being significant: cancer patients, urinary tract infection patients, patients with renal failure, patients with respiratory problems, and patients with atrial fibrillation and other heart conditions.\nWe found one main problem: this process is costly to carry out on a personal computer, especially when the data have thousands of columns (variables). We are evaluating the use of the Cloudera big data framework along with Apache Mahout[26] to build the next stage of scalable algorithms that are able to cope with big data sets. If successful, this should be accompanied by the deployment of a private cloud infrastructure[27] able to provide a machine learning as a service (MLaaS) platform, given the sensitive nature of patient data.\n\nHikari: A case study of mental health (big data) \nIn June 2015, Fujitsu Laboratories of Europe Ltd. and Fujitsu EMEIA in Spain signed a strategic research collaboration agreement with the Foundation for Biomedical Research of Hospital Cl\u00ednico San Carlos (FIBHCSC). 
Mental health was selected as a key target for the initial project for several reasons: 1) the high levels of disability and morbidity associated with mental illness; 2) the important burden that mental illness imposes on patients, both at the individual and social level, and on the use of healthcare resources; and 3) the virtual impossibility of analyzing results and their value, despite an apparently perfect design and theoretical structure of mental health services.[28]\nHikari, the Japanese word for light, is a part of Fujitsu\u2019s Zinrai Artificial Intelligence technologies focused on people that includes data analytics and semantic modeling. In this project we have used relevant dissociated clinical data from the psychiatric department, obtained during the last 10 years, including patient discharge records and the specific registries of psychiatric emergency care, in order to generate a very simple and friendly tool that gives clinicians access to information related to the main diagnosis, comorbidities, and associated health risks, as well as the possibility of analysis at the population level. It has also been useful to track the pathways that patients follow through the healthcare system, and to analyze the impact on the use of resources and costs. \nAt the present time, the database includes approximately 30,000 emergency care records and 6,500 hospitalizations; however, we expect that by the time this paper is published, it will include data from more than 370,000 outpatients and 38,000 records of day hospital care. This will help us to establish patterns of behavior of the different pathologies and conditions in terms of comorbidities, pathways, and use of resources.\n\nClinical data repository for secondary use \nHealth observatories, regardless of regional, national, or supranational level, rely on reporting data that will inform on healthcare structure and compliance with programs or pathways. 
However, data on health outcomes and results are scarce or nearly nonexistent. This is very closely related to the incoherent and fragmented evolution of health care information systems.\nIn recent decades, the demographic and social change in Western societies has become increasingly evident, bringing with it the concepts of chronicity, fragility, and complexity of patients. This makes the continuity of care centered on patients an absolute necessity if we are to keep our health systems sustainable. Probably one of the main factors involved in this kind of transformation is access to daily care data that will enable patients, professionals, managers, and health policy makers to address these challenges.\nGiven the above, the desirability of having repositories of relevant dissociated clinical data becomes more and more evident. Such repositories will allow the evaluation of the procedures and results of real clinical practice, their comparison with recommendations based on evidence, and, at the same time, the generation of new evidence from the stored data. It is essential to standardize data structure, context (actors, themes, time), continuity of care (such as UNE-EN-13940), generic reference models (such as UNE-EN-ISO 13606, part 1), understandable archetypes for clinicians (such as UNE-EN-ISO 13606, part 2), terminologies (such as SNOMED-CT), and ontologies for knowledge representation.[29] And, of course, it's vital to fulfill the criteria of privacy and data security provided in the legislation, recently renewed in Europe with a new regulation.[30]\n\nDiscussion \nThe application of KB-DSS in healthcare can provide diverse revelations. One of the most useful can be the detection of mistakes frequently made by professionals when their practice is compared against evidence-based guidelines. 
Other outputs can be more research-oriented, identifying recommendations that were thought to be good but in fact may not be, according to decisions and reasons explicitly provided by physicians while using the system.\nThe reader may have noted that we have not emphasized from the start that the data sets generated by our approach must be of considerable size (the \"V\" for \u201cvolume\u201d). The reason for this is that we are convinced that the data generated will eventually grow. However, there is an increasing need to prioritize the \"V\" for \u201cvalue.\u201d We think this value is closely linked to ensuring the \"V\" for \u201cveracity\u201d in big data approaches in healthcare, beyond the rest of the Vs (velocity, variety), which certainly depend on technological capabilities and solutions. This means that we need to put mechanisms in place to guarantee the quality and completeness of the data collected[31][32] in normalized repositories if we want to succeed in applying these techniques and obtaining valuable healthcare results.\n\nConclusion \nDecision support systems might be able to facilitate the autonomy of citizens when choosing their health options and the ability of professionals to make the most appropriate decision at the right moment. They may also help health policy makers and managers to prioritize the most needed actions in an environment with increasing health needs and resource constraints. But this will be very difficult without the development and maintenance of repositories of dissociated and normalized relevant clinical data from daily clinical practice, the contributions of the patients themselves, and the fusion with open-access data on the social environment. Furthermore, this should be quickly accompanied by a proper regulation[33] (by the qualified bodies in Europe and the FDA in the U.S.) 
that make clearer for entrepreneurs the requirements for the development, testing, and validation of these new models.\n\nReferences \n\n\n\u2191 Institute of Medicine (1990). Field, M.J.; Lohr, K.N.. ed. Clinical Practice Guidelines: Directions for a New Program. National Academies Press. doi:10.17226\/1626.   \n\n\u2191 2.0 2.1 Moskowitz, A.; McSparron, J.; Stone, D.J.; Celi, L.A. (2015). \"Preparing a New Generation of Clinicians for the Era of Big Data\". Harvard Medical Student Review 2 (1): 24\u201327. PMC PMC4327872. PMID 25688383. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4327872 .   \n\n\u2191 Toussi, M.; Lamy, J.B.; Le Toumelin, P.; Venot, A. (2009). \"Using data mining techniques to explore physicians' therapeutic decisions when clinical guidelines do not provide recommendations: Methods and example for type 2 diabetes\". BMC Medical Informatics and Decision Making 9: 28. doi:10.1186\/1472-6947-9-28. PMC PMC2700100. PMID 19515252. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2700100 .   \n\n\u2191 Gabbay, J.; le May, A. (2004). \"Evidence based guidelines or collectively constructed \"mindlines?\": Ethnographic study of knowledge management in primary care\". BMJ 329 (7473): 1013. doi:10.1136\/bmj.329.7473.1013. PMC PMC524553. PMID 15514347. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC524553 .   \n\n\u2191 Handler, T.; Hieb, B. (13 June 2007). \"The Updated Gartner CPR Generation Criteria\" (PDF). Gartner Teleconference. Gartner. https:\/\/hiriresearch.files.wordpress.com\/2011\/04\/cpr-generational-model.pdf .   \n\n\u2191 Greenes, R.A. (2015). \"Evolution and Revolution in Knowledge-Driven Health IT: A 50-Year Perspective and a Look Ahead\". In Ria\u00f1o, D.; Lenz, R.; Miksch, S. et al.. Knowledge Representation for Health Care. Lecture Notes in Computer Science. 9485. Springer. pp. 3\u201320. doi:10.1007\/978-3-319-26585-8_1. ISBN 9783319265858.  
 \n\n\u2191 Wang, D.; Peleg, M.; Tu, S.W. et al. (2004). \"Design and implementation of the GLIF3 guideline execution engine\". Journal of Biomedical Informatics 37 (5): 305\u201318. doi:10.1016\/j.jbi.2004.06.002. PMID 15488745.   \n\n\u2191 Peleg, M. (2013). \"Computer-interpretable clinical guidelines: A methodological review\". Journal of Biomedical Informatics 46 (4): 744\u201363. doi:10.1016\/j.jbi.2013.06.009. PMID 23806274.   \n\n\u2191 Gonz\u00e1lez-Ferrer, A.; Peleg, M. (2015). \"Understanding requirements of clinical data standards for developing interoperable knowledge-based DSS: A case study\". Computer Standards & Interfaces 42: 125\u201336. doi:10.1016\/j.csi.2015.06.002.   \n\n\u2191 10.0 10.1 Parimbelli, E.; Sacchi, L.; Bellazzi, R. (2016). \"Decision Support through Data Integration: Strategies to Meet the Big Data Challenge\". European Journal for Biomedical Informatics 12 (1): en10\u2013en14. https:\/\/www.ejbi.org\/archive\/ejbi-volume-12-issue-1-year-2016.html .   \n\n\u2191 Mandel, J.C.; Kreda, D.A.; Mandl, K.D. et al. (2016). \"SMART on FHIR: A standards-based, interoperable apps platform for electronic health records\". JAMIA 23 (5): 899-908. doi:10.1093\/jamia\/ocv189. PMC PMC4997036. PMID 26911829. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4997036 .   \n\n\u2191 Peleg, M.; Shahar, Y.; Quaglini, S. (2013). \"Making healthcare more accessible, better, faster, and cheaper: The MobiGuide Project\". European Journal of ePractice (20): 5\u201320. https:\/\/joinup.ec.europa.eu\/sites\/default\/files\/document\/2014-06\/ePractice%20Journal-Vol.20-November%202013.pdf .   \n\n\u2191 Lenz, R.; Peleg, M.; Reichert, M. (2012). \"Healthcare Process Support: Achievements, Challenges, Current Research\". International Journal of Knowledge-Based Organizations 2 (4): i\u2013xvi. https:\/\/www.igi-global.com\/journal\/international-journal-knowledge-based-organizations\/1177 .   
\n\n\u2191 van der Aalst, W.; Adriansyah, A.; Alves de Medeiros, A.K. et al. (2012). \"Process Mining Manifesto\". In Daniel, F.; Barkaoui, K.; Dustdar, S.. Business Process Management Workshops 2011. Lecture Notes in Business Information Processing. 99. Springer. doi:10.1007\/978-3-642-28108-2_19. ISBN 9783642281082.   \n\n\u2191 Mans, R.S.; van der Aalst, W.; Vanwersch, R.J.B. et al. (2013). \"Process Mining in Healthcare: Data Challenges When Answering Frequently Posed Questions\". In Lenz, R.; Miksch, S.; Peleg, M. et al.. Process Support and Knowledge Representation in Health Care. Lecture Notes in Computer Science. 7738. Springer. doi:10.1007\/978-3-642-36438-9_10. ISBN 9783642364389.   \n\n\u2191 Bellazzi, R.; Zupan, B. (2008). \"Predictive data mining in clinical medicine: Current issues and guidelines\". International Journal of Medical Informatics 77 (2): 81\u201397. doi:10.1016\/j.ijmedinf.2006.11.006. PMID 17188928.   \n\n\u2191 Bellazzi, R.; Ferrazzi, F.; Sacchi, L. (2011). \"Predictive data mining in clinical medicine: A focus on selected methods and applications\". Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 1 (5): 416\u201330. doi:10.1002\/widm.23.   \n\n\u2191 Dawson-Saunders, B.; Feltovich, P.J.; Coulson, R.L.; Steward, D.E. (1990). \"A survey of medical school teachers to identify basic biomedical concepts medical students should understand\". Academic Medicine 65 (7): 448\u201354. PMID 2242199.   \n\n\u2191 Gonz\u00e1lez-Ferrer, A.; Valc\u00e1rcel, M.; Ch\u00e1fer, J. et al. (2016). \"Diagn\u00f3stico y tratamiento de hiponatremia usando modelos computacionales de gu\u00edas de pr\u00e1ctica cl\u00ednica\". Actas del XIX Congreso Nacional de Inform\u00e1tica para la Salud, INFORSALUD 2016: 193\u2013198.   \n\n\u2191 Gonz\u00e1lez-Ferrer, A.; Valc\u00e1rcel, M.; Cuesta, M. et al. (2017). \"Development of a computer-interpretable clinical guideline model for decision support in the differential diagnosis of hyponatremia\". 
International Journal of Medical Informatics 103: 55\u201364. doi:10.1016\/j.ijmedinf.2017.04.014. PMID 28551002.   \n\n\u2191 Fox, J.; Johns, N.; Rahmanzadeh, A. et al. (1998). \"Disseminating medical knowledge: The PROforma approach\". Artificial Intelligence in Medicine 14 (1\u20132): 157\u201381. PMID 9779888.   \n\n\u2191 Fox, J.; Gutenstein, M.; Khan, O. et al. (2015). \"OpenClinical.net: A platform for creating and sharing knowledge and promoting best practice in healthcare\". Computers in Industry 66: 63\u201372. doi:10.1016\/j.compind.2014.10.001.   \n\n\u2191 Fenske, W.; Maier, S.K.; Blechschmidt, A. et al. (2010). \"Utility and limitations of the traditional diagnostic approach to hyponatremia: A diagnostic study\". American Journal of Medicine 123 (7): 652\u20137. doi:10.1016\/j.amjmed.2010.01.013. PMID 20609688.   \n\n\u2191 Gonz\u00e1lez-Ferrer, A.; Valc\u00e1rcel, M.; Cuesta, M. et al.. \"Comorbidities in the Syndrome of Inappropriate Antidiuretic Hormone Secretion: A Hierarchical Clustering Analysis on Discharge Data\". To be published.   \n\n\u2191 Clarke, K.R.; Somerfield, P.J.; Gorley, R.N. et al. (2008). \"Testing of null hypotheses in exploratory community analyses: Similarity profiles and biota-environment linkage\". Journal of Experimental Marine Biology and Ecology 366 (1\u20132): 56\u201369. doi:10.1016\/j.jembe.2008.07.009.   \n\n\u2191 Owen, S; Anil, R.; Dunning, T. et al. (2011). Mahout in Action. Manning Publications. pp. 416. ISBN 9781935182689.   \n\n\u2191 \"Smart Hospitals: Security and Resilience for Smart Health Service and Infrastructures\". European Union Agency for Network and Information Security. November 2016. doi:10.2824\/28801. https:\/\/www.enisa.europa.eu\/publications\/cyber-security-and-resilience-for-smart-hospitals .   \n\n\u2191 Seara, G.; Pay\u00e1, A.; Mayol, J. (2016). \"Value-based healthcare delivery in the digital era\". European Psychiatry 33 (Supplement): S33. doi:10.1016\/j.eurpsy.2016.01.862.   
\n\n\u2191 Guti\u00e9rrez, A.R.; Cuenca, G.M.; Acebedo, I.A. et al. (June 2013). [http:\/\/gesdoc.isciii.es\/gesdoccontroller?action=download&id=29\/11\/2013-45c9ee530c \"Manual pr\u00e1ctico\nde interoperabilidad sem\u00e1ntica para entornos sanitarios basada en arquetipos\"] (PDF). Unidad de Investigaci\u00f3n en Telemedicina y e-Salud. pp. 152. http:\/\/gesdoc.isciii.es\/gesdoccontroller?action=download&id=29\/11\/2013-45c9ee530c .   \n\n\u2191 \"Regulation (EU) 2016\/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95\/46\/EC (General Data Protection Regulation)\". EUR-Lex. European Union. 2016. http:\/\/eur-lex.europa.eu\/legal-content\/EN\/TXT\/?uri=CELEX:32016R0679 .   \n\n\u2191 Costa-Pereira, A.; Chen, R.; Almeida, F.C. et al. (2009). \"Chapter 4: Data Quality and Integration Issues in Electronic Health Records\". In Hristidis, V.. Information Discovery on Electronic Health Records. Taylor & Francis Group. pp. 55\u201395. ISBN 9781420090413.   \n\n\u2191 Weiskopf, N.G.; Weng, C. (2013). \"Methods and dimensions of electronic health record data quality assessment: Enabling reuse for clinical research\". JAMIA 20 (1): 144\u201351. doi:10.1136\/amiajnl-2011-000681. PMC PMC3555312. PMID 22733976. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3555312 .   \n\n\u2191 Brown, S.H.; Miller, R.A. (2014). \"Chapter 26: Legal and Regulatory Issues Related to the Use of Clinical Software in Health Care Delivery\". In Greenes, R.A.. Clinical Decision Support. Academic Press. pp. 711\u2013740. doi:10.1016\/B978-0-12-398476-0.00026-9. ISBN 9780123984760.   \n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to grammar, spelling, and presentation, including the addition of PMCID and DOI when they were missing from the original reference. 
The inline citation for citation 24 was misnumbered in the original text; it's corrected here.\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\">https:\/\/www.limswiki.org\/index.php\/Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare<\/a>\nCategories: LIMSwiki journal articles (added in 2018); LIMSwiki journal articles (all); LIMSwiki journal articles on health informatics\n
This page was last modified on 13 March 2018, at 20:29.\nContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n\n","75f1e35ff0bfbfc1a106a26c2f646394_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:Generating big data sets from knowledge-based decision support systems to pursue value-based healthcare<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p>Talking about big data in healthcare we usually refer to how to use data collected from current <a href=\"https:\/\/www.limswiki.org\/index.php\/Electronic_medical_record\" title=\"Electronic medical record\" target=\"_blank\" class=\"wiki-link\" data-key=\"99a695d2af23397807da0537d29d0be7\">electronic medical records<\/a>, either structured or 
unstructured, to answer clinically relevant questions. This operation is typically carried out by means of analytics tools (e.g., machine learning) or by extracting relevant data from patient summaries through natural language processing techniques. From other perspectives of research in <a href=\"https:\/\/www.limswiki.org\/index.php\/Medical_informatics\" title=\"Medical informatics\" class=\"mw-redirect wiki-link\" target=\"_blank\" data-key=\"f89ecb3b26617b8c6e09bc5e050cfd5d\">medical informatics<\/a>, powerful initiatives have emerged to help physicians make decisions, in both diagnostics and therapeutics, built from existing medical evidence (i.e., knowledge-based <a href=\"https:\/\/www.limswiki.org\/index.php\/Clinical_decision_support_system\" title=\"Clinical decision support system\" target=\"_blank\" class=\"wiki-link\" data-key=\"095141425468d057aa977016869ca37d\">decision support systems<\/a>). Many of the problems these tools have shown, when used in real clinical settings, are related to their implementation and deployment, more than failing in their support; however, technology is slowly overcoming interoperability and integration issues. Beyond the point-of-care decision support these tools can provide, the data generated when using them, even in controlled trials, could be used to further analyze facts that are traditionally ignored in the current clinical practice. 
In this paper, we reflect on the technologies available to make the leap and how they could help drive healthcare organizations to shift to a value-based healthcare philosophy.\n<\/p><p><b>Keywords<\/b>: big data, DSS, e-health, knowledge management, management systems\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<p>Healthcare took a big step towards modernization with the emergence of the evidence-based medicine (EBM) concept in the late 1980s.<sup id=\"rdp-ebb-cite_ref-FieldClinical90_1-0\" class=\"reference\"><a href=\"#cite_note-FieldClinical90-1\" rel=\"external_link\">[1]<\/a><\/sup> EBM is an approach to medical practice that aims to apply the best known scientific evidence to clinical decision-making regarding diagnosis and effective management of specific conditions and diseases. While the EBM concept was generally well received by care professionals, many factors \u2014 such as their daily work conditions or their high work load \u2014 affect how this approach is put into practice. A 2012 report from the Institute of Medicine revealed that only 10 to 20 percent of the decisions clinicians make are evidence-based.<sup id=\"rdp-ebb-cite_ref-MoskowitzPrep15_2-0\" class=\"reference\"><a href=\"#cite_note-MoskowitzPrep15-2\" rel=\"external_link\">[2]<\/a><\/sup> This fact reflects the need for medical practitioners, supported by their healthcare organizations, to shift their behavior regarding the way clinical practice is currently carried out.\n<\/p><p>The idea of EBM emerged in very different conditions from the current scenario. Over nearly thirty years, an explosion of technical possibilities has emerged to help organizations take a more modern approach, providing them with support in this regard. EBM can be driven not only by epidemiological research but also by new data-oriented approaches. 
By \u201cdata-oriented,\u201d we refer to data about real daily clinical practice: how, when, why, and by whom clinical actions are carried out (or not), and what the health results of those actions are. Nonetheless, this might still be hampered by the current design of electronic medical records (EMRs) and by the role and focus that contemporary doctors should adopt. The use of EMRs by physicians could be insufficient, as recognized by studies<sup id=\"rdp-ebb-cite_ref-MoskowitzPrep15_2-1\" class=\"reference\"><a href=\"#cite_note-MoskowitzPrep15-2\" rel=\"external_link\">[2]<\/a><\/sup> showing that, even after the digitalization of healthcare, they are far from being used to their full potential.\n<\/p><p>The EBM approach was crafted with the goal of pursuing effectiveness in disease management, which left aside the organizational and human factors that are crucial in how decisions are truly made. By analyzing data generated by healthcare organizations we could obtain information concerning the pitfalls that are hindering evidence-based clinical actions. At the same time, new evidence could be unveiled that is probably not considered in the current production of clinical practice guidelines (CPGs). For example, Toussi <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-ToussiUsing09_3-0\" class=\"reference\"><a href=\"#cite_note-ToussiUsing09-3\" rel=\"external_link\">[3]<\/a><\/sup> used data mining techniques to find out how physicians prescribe medications in diverse cases with various clinical conditions, in order to complement existing clinical guidelines where sufficient evidence is lacking. 
Furthermore, specific training actions could be directed at addressing common failures detected in the management of medical conditions.\n<\/p><p>Therefore, the problem that healthcare organizations are trying to solve, under the hypothesis that the \u201cbig data\u201d paradigm will change the way clinical practice is currently carried out, is how they can produce data that help to unveil real clinical behavior and \"mindlines,\"<sup id=\"rdp-ebb-cite_ref-GabbayEvidence04_4-0\" class=\"reference\"><a href=\"#cite_note-GabbayEvidence04-4\" rel=\"external_link\">[4]<\/a><\/sup> linked with other organizational data (e.g., costs) and context information that could be behind their actions and decisions. Only by making this analysis possible will they be able to change their philosophy to pursue and underpin value, beyond so-called effectiveness. And value here means detecting which actions \u2014 later possibly abstracted into policies \u2014 could really improve the behavior of the organizations and care professionals for the better care of their patients. 
\n<\/p><p>In this paper, we intend to reflect on some existing techniques beyond current electronic medical records (EMRs) that can help to generate such data sets, as well as considerations to be made, providing some examples of initiatives we are trying to push forward from the Innovation Unit of Hospital Universitario Cl\u00ednico San Carlos (HUCSC).\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Knowledge-based_decision_support_systems_.28KB-DSS.29\">Knowledge-based decision support systems (KB-DSS)<\/span><\/h2>\n<p>In 2007, Gartner<sup id=\"rdp-ebb-cite_ref-HandlerTheUpd07_5-0\" class=\"reference\"><a href=\"#cite_note-HandlerTheUpd07-5\" rel=\"external_link\">[5]<\/a><\/sup> reported a five-stage evolution model for <a href=\"https:\/\/www.limswiki.org\/index.php\/Electronic_health_record\" title=\"Electronic health record\" target=\"_blank\" class=\"wiki-link\" data-key=\"f2e31a73217185bb01389404c1fd5255\">electronic health records<\/a> (EHRs) where they established a path of characteristics, in terms of eight core capabilities, that EHRs should follow in order to provide the proper support to care professionals. Systems complying with Generation 3 requisites were supposed to be able to bring evidence-based medicine to the point of care, and theoretically coincide with the capabilities of most available EHRs, which progressed mainly through the core capabilities of system management, interoperability, and clinical data models, though there is still space for improvement today. 
Generation 4 was expected to improve the core capabilities of decision support, clinically relevant data analysis, presentation, and clinical <a href=\"https:\/\/www.limswiki.org\/index.php\/Workflow\" title=\"Workflow\" target=\"_blank\" class=\"wiki-link\" data-key=\"92bd8748272e20d891008dcb8243e8a8\">workflow<\/a> management.\n<\/p><p>More recently, Greenes offered his view about the past and future of knowledge-driven health IT<sup id=\"rdp-ebb-cite_ref-GreenesEvo15_6-0\" class=\"reference\"><a href=\"#cite_note-GreenesEvo15-6\" rel=\"external_link\">[6]<\/a><\/sup>, stating that current EHR systems were built for a model that is now old and even inappropriate, supported by proprietary infrastructures and knowledge content. He also mentioned the gradual increase in knowledge-based applications during the 2000s, with the creation of computer-interpretable clinical guideline formalisms like GLIF<sup id=\"rdp-ebb-cite_ref-WangDesign04_7-0\" class=\"reference\"><a href=\"#cite_note-WangDesign04-7\" rel=\"external_link\">[7]<\/a><\/sup> and others.<sup id=\"rdp-ebb-cite_ref-PelegComp13_8-0\" class=\"reference\"><a href=\"#cite_note-PelegComp13-8\" rel=\"external_link\">[8]<\/a><\/sup> By that time, these systems were having little penetration into real clinical settings, mostly due to the lack of pervasiveness of standards and the use of proprietary tools. Fortunately, this fact is something widely recognized by the current health IT community, and steps have been taken to tackle these problems. 
From requirements analysis of data standards<sup id=\"rdp-ebb-cite_ref-Gonz.C3.A1lez-FerrerUnder15_9-0\" class=\"reference\"><a href=\"#cite_note-Gonz.C3.A1lez-FerrerUnder15-9\" rel=\"external_link\">[9]<\/a><\/sup> and development data integration mechanisms<sup id=\"rdp-ebb-cite_ref-ParimbelliDecision16_10-0\" class=\"reference\"><a href=\"#cite_note-ParimbelliDecision16-10\" rel=\"external_link\">[10]<\/a><\/sup> for making DSS interoperable, the emergence of new lightweight web services standards like the <a href=\"https:\/\/www.limswiki.org\/index.php\/HL7\" title=\"HL7\" class=\"mw-redirect wiki-link\" target=\"_blank\" data-key=\"944ec30acac5b7c05ef9ce3c1b4c22dc\">HL7<\/a> Fast Healthcare Interoperability Resources (FHIR)<sup id=\"rdp-ebb-cite_ref-MandelSMART16_11-0\" class=\"reference\"><a href=\"#cite_note-MandelSMART16-11\" rel=\"external_link\">[11]<\/a><\/sup>, to substantial investments from public bodies that ended up with real deployments and piloting of patient guidance systems. A good example is MobiGuide<sup id=\"rdp-ebb-cite_ref-ParimbelliDecision16_10-1\" class=\"reference\"><a href=\"#cite_note-ParimbelliDecision16-10\" rel=\"external_link\">[10]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-PelegMaking13_12-0\" class=\"reference\"><a href=\"#cite_note-PelegMaking13-12\" rel=\"external_link\">[12]<\/a><\/sup>, a project funded by the European Commission under the seventh framework program (FP7). 
Its goal was to create an intelligent KB-DSS to help physicians and patients make the most appropriate decisions to manage specific conditions (atrial fibrillation, gestational diabetes) using a back-end server and wearable sensors to monitor patients' status.\n<\/p><p>In this context, Figure 1 shows the architecture that reflects our view, well aligned with positions already expressed by some research communities.<sup id=\"rdp-ebb-cite_ref-LenzHealth12_13-0\" class=\"reference\"><a href=\"#cite_note-LenzHealth12-13\" rel=\"external_link\">[13]<\/a><\/sup> From top to bottom and left to right, physicians and epidemiologists develop CPGs that can be computerized, together with knowledge engineers, into computer-interpretable guideline (CIG) models. With the proper validation mechanisms, using data previously aggregated into clinical data repositories, these models can be trialed, after the corresponding integration into <a href=\"https:\/\/www.limswiki.org\/index.php\/Hospital_information_system\" title=\"Hospital information system\" target=\"_blank\" class=\"wiki-link\" data-key=\"d8385de7b1f39a39d793f8ce349b448d\">hospital information systems<\/a>. The execution of CIG models can start generating data sets composed of physicians' acceptances or rejections of recommendations (e.g., diagnoses, drug prescriptions, therapies) provided by the knowledge-based DSS, together with the treatment paths followed for different patient profiles. 
These paths can later on be analyzed by means of process mining techniques<sup id=\"rdp-ebb-cite_ref-AalstProcess12_14-0\" class=\"reference\"><a href=\"#cite_note-AalstProcess12-14\" rel=\"external_link\">[14]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-MansProcess13_15-0\" class=\"reference\"><a href=\"#cite_note-MansProcess13-15\" rel=\"external_link\">[15]<\/a><\/sup>, unveiling common practices followed while using decision support and comparing the compliance of traditional clinical practice with the one recommended by the evidence-based DSS. At the same time, normalized clinical data repositories, while ensuring the quality of the data stored, can be used in the traditional view of machine learning and big data research.<sup id=\"rdp-ebb-cite_ref-BellazziPred08_16-0\" class=\"reference\"><a href=\"#cite_note-BellazziPred08-16\" rel=\"external_link\">[16]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-BellazziPredict11_17-0\" class=\"reference\"><a href=\"#cite_note-BellazziPredict11-17\" rel=\"external_link\">[17]<\/a><\/sup> The results could be complemented by comparing them with the output data sets of the KB-DSS. 
The output of the research could provide new evidence to be included in new versions of the CPGs (continuous improvement).\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig1_Gonz%C3%A1lez-FerrerIJIMAI2018_4-7.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"b2bebd1af9a63fd0e3ed1d9f87df6c8d\"><img alt=\"Fig1 Gonz\u00e1lez-FerrerIJIMAI2018 4-7.png\" src=\"https:\/\/www.limswiki.org\/images\/5\/5f\/Fig1_Gonz%C3%A1lez-FerrerIJIMAI2018_4-7.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 1:<\/b> Architecture for the generation of evidence-based big data sets<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h2><span class=\"mw-headline\" id=\"Innovative_projects_in_HUCSC\">Innovative projects in HUCSC<\/span><\/h2>\n<p>The Innovation Unit of Hospital Universitario Cl\u00ednico San Carlos, being transversal to the healthcare institution, is intended to cover two main aspects of innovation, always pursuing to increase value. On the one hand, it is expected to help <a href=\"https:\/\/www.limswiki.org\/index.php\/Hospital\" title=\"Hospital\" target=\"_blank\" class=\"wiki-link\" data-key=\"b8f070c66d8123fe91063594befebdff\">hospital<\/a> professionals to get their research into the mainstream, when there is an opportunity for it. On the other hand, it maintains a technical department to develop innovative products and test their prototypes, driving the hospital to maximize the possibilities that technological solutions could provide, especially artificial intelligence-based tools. 
\n<\/p><p>The ultimate intention is to disseminate the existence of these techniques while facilitating its understanding, create a culture of innovation within the hospital, and, when possible, get external companies to finalize these prototypes (or collaborate in the development) if they are demonstrated relevant and close to market possibility. The following are several ongoing projects aligned with these goals and that contribute several methods and artifacts to the architecture presented.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Computer-interpretable_guideline_for_diagnosis_and_treatment_of_hyponatremia\">Computer-interpretable guideline for diagnosis and treatment of hyponatremia<\/span><\/h3>\n<p>The endocrinology department demanded a process-based solution to help new residents to improve their ability to diagnose and manage the hyponatremia condition (presenting low levels of serum sodium). Hyponatremia is the most frequent electrolyte disorder; however, according to some studies, it has proved to be very difficult to comprehend by physicians in general.<sup id=\"rdp-ebb-cite_ref-Dawson-SaundersASurv90_18-0\" class=\"reference\"><a href=\"#cite_note-Dawson-SaundersASurv90-18\" rel=\"external_link\">[18]<\/a><\/sup> To address this project, we developed a CIG model<sup id=\"rdp-ebb-cite_ref-Gonz.C3.A1lez-FerrerDiag16_19-0\" class=\"reference\"><a href=\"#cite_note-Gonz.C3.A1lez-FerrerDiag16-19\" rel=\"external_link\">[19]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-Gonz.C3.A1lez-FerrerDev17_20-0\" class=\"reference\"><a href=\"#cite_note-Gonz.C3.A1lez-FerrerDev17-20\" rel=\"external_link\">[20]<\/a><\/sup> using the PROForma set of tools<sup id=\"rdp-ebb-cite_ref-FoxDissem98_21-0\" class=\"reference\"><a href=\"#cite_note-FoxDissem98-21\" rel=\"external_link\">[21]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-FoxOpen15_22-0\" class=\"reference\"><a href=\"#cite_note-FoxOpen15-22\" rel=\"external_link\">[22]<\/a><\/sup>, covering the diagnosis of hyponatremia, 
classifying it into thirteen different subtypes. During a retrospective validation of the system with the data from 65 patients, we compared the system\u2019s output to the consensus diagnosis of two experts, obtaining very high agreement (kappa=0.86). This agreement was also higher than that reported in a previous experiment in the literature<sup id=\"rdp-ebb-cite_ref-FenskeUtility10_23-0\" class=\"reference\"><a href=\"#cite_note-FenskeUtility10-23\" rel=\"external_link\">[23]<\/a><\/sup>, which compared the performance of a resident physician (using the original paper guideline) with the diagnosis of senior physicians. Nonetheless, the most relevant benefit of using such a system, beyond its successful diagnostic performance, was the identification and recording of cases that were contrary to the consensus of international hyponatremia experts, specifically regarding hypoaldosteronism, where concrete marker thresholds were thought to be associated with its diagnosis. The application of our model found several cases where this hypothesis did not apply, showing the lack of real evidence and the need for further research. This is a concrete demonstration of how putting these knowledge-based systems into practice can help detect where evidence is failing and focus new research directions.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Unsupervised_learning_of_discharge_data_.28big_data.29\">Unsupervised learning of discharge data (big data)<\/span><\/h3>\n<p>The syndrome of inappropriate antidiuretic hormone secretion (SIADH) represents around one-third of all cases of hyponatremia. 
We carried out a project<sup id=\"rdp-ebb-cite_ref-Gonz.C3.A1lez-FerrerComorb_24-0\" class=\"reference\"><a href=\"#cite_note-Gonz.C3.A1lez-FerrerComorb-24\" rel=\"external_link\">[24]<\/a><\/sup> to identify clusters of hospitalized SIADH patients sharing diagnosed pathologies (comorbidities); the results agreed with and extended previous research identifying individual comorbidities.\n<\/p><p>Our methods included testing two different distance measures and hierarchical agglomerative clustering. We used similarity profile analysis to determine the number of significant clusters and the membership of individuals<sup id=\"rdp-ebb-cite_ref-ClarkeTesting08_25-0\" class=\"reference\"><a href=\"#cite_note-ClarkeTesting08-25\" rel=\"external_link\">[25]<\/a><\/sup> (by means of the SIMPROF method included in the clustsig R package). The method also provides the members of each proposed cluster, and the validity of the clusters produced is assessed by iteratively carrying out hundreds of permutation tests. Applied to the data from around 650 patients, it revealed eight clusters, five of them significant: cancer patients, urinary tract infection patients, patients with renal failure, patients with respiratory problems, and patients with atrial fibrillation and other heart conditions.\n<\/p><p>We found one main problem: this process is costly to carry out on a personal computer, especially when the data have thousands of columns (variables). We are evaluating the use of the Cloudera big data framework along with Apache Mahout<sup id=\"rdp-ebb-cite_ref-OwenMahout11_26-0\" class=\"reference\"><a href=\"#cite_note-OwenMahout11-26\" rel=\"external_link\">[26]<\/a><\/sup> to build the next stage of scalable algorithms that are able to cope with big data sets. 
If successful, this should be accompanied by the deployment of a private cloud infrastructure<sup id=\"rdp-ebb-cite_ref-ENISASmart16_27-0\" class=\"reference\"><a href=\"#cite_note-ENISASmart16-27\" rel=\"external_link\">[27]<\/a><\/sup> able to provide a machine learning as a service (MLaaS) platform, given the sensitive nature of patient data.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Hikari:_A_case_study_of_mental_health_.28big_data.29\">Hikari: A case study of mental health (big data)<\/span><\/h3>\n<p>In June 2015, Fujitsu Laboratories of Europe Ltd. and Fujitsu EMEIA in Spain signed a strategic research collaboration agreement with the Foundation for Biomedical Research of Hospital Cl\u00ednico San Carlos (FIBHCSC). Mental health was selected as a key target for the initial project for several reasons: 1) the high levels of disability and morbidity associated with mental illness; 2) the important burden that mental illness imposes on patients, both at the individual and social level, and on the use of healthcare resources; and 3) the virtual impossibility of analyzing results and their value, despite an apparently perfect design and theoretical structure of mental health services.<sup id=\"rdp-ebb-cite_ref-SearaValue16_28-0\" class=\"reference\"><a href=\"#cite_note-SearaValue16-28\" rel=\"external_link\">[28]<\/a><\/sup>\n<\/p><p>Hikari, the Japanese word for light, is part of Fujitsu\u2019s Zinrai Artificial Intelligence technologies focused on people, and it includes data analytics and semantic modeling. 
In this project we have used relevant dissociated clinical data from the psychiatric department, obtained during the last 10 years, including patient discharge records and the specific registries of psychiatric emergency care, in order to generate a very simple, user-friendly tool that gives clinicians access to <a href=\"https:\/\/www.limswiki.org\/index.php\/Information\" title=\"Information\" target=\"_blank\" class=\"wiki-link\" data-key=\"6300a14d9c2776dcca0999b5ed940e7d\">information<\/a> related to the main diagnosis, comorbidities, and associated health risks, and that also supports analysis at the population level. It has also been useful for tracking the pathways that patients follow through the healthcare system, and for analyzing the impact on the use of resources and costs.\n<\/p><p>At present, the database includes approximately 30,000 emergency care records and 6,500 hospitalizations; however, we expect that by the time this paper is published, it will include data from more than 370,000 outpatients and 38,000 records of day hospital care. This will help us to establish patterns of behavior of the different pathologies and conditions in terms of comorbidities, pathways, and use of resources.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Clinical_data_repository_for_secondary_use\">Clinical data repository for secondary use<\/span><\/h3>\n<p>Health observatories, whether regional, national, or supranational, rely on reported data that inform on healthcare structure and compliance with programs or pathways. However, data on health outcomes and results are scarce or nearly nonexistent. This is very closely related to the incoherent and fragmented evolution of health care information systems.\n<\/p><p>In recent decades, the demographic and social change in Western societies has become increasingly evident, bringing with it the concepts of chronicity, fragility, and complexity of patients. 
This makes patient-centered continuity of care an absolute necessity if we are to keep our health systems sustainable. Probably one of the main factors involved in this kind of transformation is access to daily care data that will enable patients, professionals, managers, and health policy makers to address these challenges.\n<\/p><p>Given the above, the desirability of having repositories of relevant dissociated clinical data becomes more and more evident: they allow the evaluation of the procedures and results of real clinical practice, their comparison with evidence-based recommendations, and, at the same time, the generation of new evidence from the stored data. It is essential to standardize data structure, context (actors, themes, time), continuity of care (such as UNE-EN-13940), generic reference models (such as UNE-EN-ISO 13606, part 1), understandable archetypes for clinicians (such as UNE-EN-ISO 13606, part 2), terminologies (such as SNOMED-CT), and ontologies for knowledge representation.<sup id=\"rdp-ebb-cite_ref-Guti.C3.A9rrezManual13_29-0\" class=\"reference\"><a href=\"#cite_note-Guti.C3.A9rrezManual13-29\" rel=\"external_link\">[29]<\/a><\/sup> And, of course, it is vital to fulfill the criteria of privacy and data security provided in the legislation, recently renewed in Europe with a new regulation.<sup id=\"rdp-ebb-cite_ref-EURegulation16_30-0\" class=\"reference\"><a href=\"#cite_note-EURegulation16-30\" rel=\"external_link\">[30]<\/a><\/sup>\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Discussion\">Discussion<\/span><\/h2>\n<p>The application of KB-DSS in healthcare can provide diverse insights. One of the most useful can be the detection of mistakes frequently made by professionals, as revealed by comparison with evidence-based guidelines. 
Other outputs can be more research-oriented, identifying situations that were thought to be good recommendations but in fact may not be, according to decisions and reasons explicitly provided by physicians while using the system.\n<\/p><p>The reader may have noted that we have not emphasized considerable size (the \"V\" for \u201cvolume\u201d) as a requirement of the data sets generated by our approach. The reason for this is that we are convinced that the data generated will eventually grow. However, there is an increasing need to prioritize the \"V\" for \u201cvalue.\u201d We think this value is closely linked to ensuring the \"V\" for \u201cveracity\u201d in big data approaches in healthcare, beyond the rest of the Vs (velocity, variety), which certainly depend on technological capabilities and solutions. This means that we need mechanisms to guarantee the quality and completeness of the data collected<sup id=\"rdp-ebb-cite_ref-Costa-PereiraInform09_31-0\" class=\"reference\"><a href=\"#cite_note-Costa-PereiraInform09-31\" rel=\"external_link\">[31]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-WeiskopfMeth13_32-0\" class=\"reference\"><a href=\"#cite_note-WeiskopfMeth13-32\" rel=\"external_link\">[32]<\/a><\/sup> in normalized repositories if we want to succeed in applying these techniques and obtaining valuable healthcare results.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Conclusion\">Conclusion<\/span><\/h2>\n<p>Decision support systems might be able to facilitate the autonomy of citizens when choosing their health options and the ability of professionals to make the most appropriate decision at the right moment. They may also help health policy makers and managers to prioritize the most needed actions in an environment with increasing health needs and resource constraints. 
But this will be very difficult without the development and maintenance of repositories of dissociated and normalized relevant clinical data from daily clinical practice, the contributions of the patients themselves, and the fusion with open-access data on the social environment. Furthermore, this should soon be accompanied by proper regulation<sup id=\"rdp-ebb-cite_ref-BrownLegal14_33-0\" class=\"reference\"><a href=\"#cite_note-BrownLegal14-33\" rel=\"external_link\">[33]<\/a><\/sup> (by the qualified bodies in Europe and the FDA in the U.S.) that makes the requirements for the development, testing, and validation of these new models clearer for entrepreneurs.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-FieldClinical90-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-FieldClinical90_1-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Institute of Medicine (1990). Field, M.J.; Lohr, K.N.. ed. <i>Clinical Practice Guidelines: Directions for a New Program<\/i>. National Academies Press. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.17226%2F1626\" target=\"_blank\">10.17226\/1626<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=book&rft.btitle=Clinical+Practice+Guidelines%3A+Directions+for+a+New+Program&rft.aulast=Institute+of+Medicine&rft.au=Institute+of+Medicine&rft.date=1990&rft.pub=National+Academies+Press&rft_id=info:doi\/10.17226%2F1626&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MoskowitzPrep15-2\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-MoskowitzPrep15_2-0\" rel=\"external_link\">2.0<\/a><\/sup> <sup><a href=\"#cite_ref-MoskowitzPrep15_2-1\" rel=\"external_link\">2.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Moskowitz, A.; McSparron, J.; Stone, D.J.; Celi, L.A. (2015). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4327872\" target=\"_blank\">\"Preparing a New Generation of Clinicians for the Era of Big Data\"<\/a>. <i>Harvard Medical Student Review<\/i> <b>2<\/b> (1): 24\u201327. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4327872\/\" target=\"_blank\">PMC4327872<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25688383\" target=\"_blank\">25688383<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4327872\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4327872<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Preparing+a+New+Generation+of+Clinicians+for+the+Era+of+Big+Data&rft.jtitle=Harvard+Medical+Student+Review&rft.aulast=Moskowitz%2C+A.%3B+McSparron%2C+J.%3B+Stone%2C+D.J.%3B+Celi%2C+L.A.&rft.au=Moskowitz%2C+A.%3B+McSparron%2C+J.%3B+Stone%2C+D.J.%3B+Celi%2C+L.A.&rft.date=2015&rft.volume=2&rft.issue=1&rft.pages=24%E2%80%9327&rft_id=info:pmc\/PMC4327872&rft_id=info:pmid\/25688383&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4327872&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ToussiUsing09-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ToussiUsing09_3-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Toussi, M.; Lamy, J.B.; Le Toumelin, P.; Venot, A. (2009). 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2700100\" target=\"_blank\">\"Using data mining techniques to explore physicians' therapeutic decisions when clinical guidelines do not provide recommendations: Methods and example for type 2 diabetes\"<\/a>. <i>BMC Medical Informatics and Decision Making<\/i> <b>9<\/b>: 28. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2F1472-6947-9-28\" target=\"_blank\">10.1186\/1472-6947-9-28<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC2700100\/\" target=\"_blank\">PMC2700100<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/19515252\" target=\"_blank\">19515252<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2700100\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2700100<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Using+data+mining+techniques+to+explore+physicians%27+therapeutic+decisions+when+clinical+guidelines+do+not+provide+recommendations%3A+Methods+and+example+for+type+2+diabetes&rft.jtitle=BMC+Medical+Informatics+and+Decision+Making&rft.aulast=Toussi%2C+M.%3B+Lamy%2C+J.B.%3B+Le+Toumelin%2C+P.%3B+Venot%2C+A.&rft.au=Toussi%2C+M.%3B+Lamy%2C+J.B.%3B+Le+Toumelin%2C+P.%3B+Venot%2C+A.&rft.date=2009&rft.volume=9&rft.pages=28&rft_id=info:doi\/10.1186%2F1472-6947-9-28&rft_id=info:pmc\/PMC2700100&rft_id=info:pmid\/19515252&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC2700100&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GabbayEvidence04-4\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GabbayEvidence04_4-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Gabbay, J.; le May, A. (2004). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC524553\" target=\"_blank\">\"Evidence based guidelines or collectively constructed \"mindlines?\": Ethnographic study of knowledge management in primary care\"<\/a>. <i>BMJ<\/i> <b>329<\/b> (7473): 1013. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1136%2Fbmj.329.7473.1013\" target=\"_blank\">10.1136\/bmj.329.7473.1013<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC524553\/\" target=\"_blank\">PMC524553<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/15514347\" target=\"_blank\">15514347<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC524553\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC524553<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Evidence+based+guidelines+or+collectively+constructed+%22mindlines%3F%22%3A+Ethnographic+study+of+knowledge+management+in+primary+care&rft.jtitle=BMJ&rft.aulast=Gabbay%2C+J.%3B+le+May%2C+A.&rft.au=Gabbay%2C+J.%3B+le+May%2C+A.&rft.date=2004&rft.volume=329&rft.issue=7473&rft.pages=1013&rft_id=info:doi\/10.1136%2Fbmj.329.7473.1013&rft_id=info:pmc\/PMC524553&rft_id=info:pmid\/15514347&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC524553&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li 
id=\"cite_note-HandlerTheUpd07-5\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HandlerTheUpd07_5-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Handler, T.; Hieb, B. (13 June 2007). <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/hiriresearch.files.wordpress.com\/2011\/04\/cpr-generational-model.pdf\" target=\"_blank\">\"The Updated Gartner CPR Generation Criteria\"<\/a> (PDF). <i>Gartner Teleconference<\/i>. Gartner<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/hiriresearch.files.wordpress.com\/2011\/04\/cpr-generational-model.pdf\" target=\"_blank\">https:\/\/hiriresearch.files.wordpress.com\/2011\/04\/cpr-generational-model.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=The+Updated+Gartner+CPR+Generation+Criteria&rft.atitle=Gartner+Teleconference&rft.aulast=Handler%2C+T.%3B+Hieb%2C+B.&rft.au=Handler%2C+T.%3B+Hieb%2C+B.&rft.date=13+June+2007&rft.pub=Gartner&rft_id=https%3A%2F%2Fhiriresearch.files.wordpress.com%2F2011%2F04%2Fcpr-generational-model.pdf&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GreenesEvo15-6\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GreenesEvo15_6-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Greenes, R.A. (2015). \"Evolution and Revolution in Knowledge-Driven Health IT: A 50-Year Perspective and a Look Ahead\". In Ria\u00f1o, D.; Lenz, R.; Miksch, S. et al.. <i>Knowledge Representation for Health Care<\/i>. Lecture Notes in Computer Science. <b>9485<\/b>. Springer. pp. 3\u201320. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2F978-3-319-26585-8_1\" target=\"_blank\">10.1007\/978-3-319-26585-8_1<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9783319265858.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Evolution+and+Revolution+in+Knowledge-Driven+Health+IT%3A+A+50-Year+Perspective+and+a+Look+Ahead&rft.atitle=Knowledge+Representation+for+Health+Care&rft.aulast=Greenes%2C+R.A.&rft.au=Greenes%2C+R.A.&rft.date=2015&rft.series=Lecture+Notes+in+Computer+Science&rft.volume=9485&rft.pages=pp.%26nbsp%3B3%E2%80%9320&rft.pub=Springer&rft_id=info:doi\/10.1007%2F978-3-319-26585-8_1&rft.isbn=9783319265858&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WangDesign04-7\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WangDesign04_7-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Wang, D.; Peleg, M.; Tu, S.W. et al. (2004). \"Design and implementation of the GLIF3 guideline execution engine\". <i>Journal of Biomedical Informatics<\/i> <b>37<\/b> (5): 305\u201318. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.jbi.2004.06.002\" target=\"_blank\">10.1016\/j.jbi.2004.06.002<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/15488745\" target=\"_blank\">15488745<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Design+and+implementation+of+the+GLIF3+guideline+execution+engine&rft.jtitle=Journal+of+Biomedical+Informatics&rft.aulast=Wang%2C+D.%3B+Peleg%2C+M.%3B+Tu%2C+S.W.+et+al.&rft.au=Wang%2C+D.%3B+Peleg%2C+M.%3B+Tu%2C+S.W.+et+al.&rft.date=2004&rft.volume=37&rft.issue=5&rft.pages=305%E2%80%9318&rft_id=info:doi\/10.1016%2Fj.jbi.2004.06.002&rft_id=info:pmid\/15488745&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PelegComp13-8\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PelegComp13_8-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Peleg, M. (2013). \"Computer-interpretable clinical guidelines: A methodological review\". <i>Journal of Biomedical Informatics<\/i> <b>46<\/b> (4): 744\u201363. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.jbi.2013.06.009\" target=\"_blank\">10.1016\/j.jbi.2013.06.009<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23806274\" target=\"_blank\">23806274<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Computer-interpretable+clinical+guidelines%3A+A+methodological+review&rft.jtitle=Journal+of+Biomedical+Informatics&rft.aulast=Peleg%2C+M.&rft.au=Peleg%2C+M.&rft.date=2013&rft.volume=46&rft.issue=4&rft.pages=744%E2%80%9363&rft_id=info:doi\/10.1016%2Fj.jbi.2013.06.009&rft_id=info:pmid\/23806274&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Gonz.C3.A1lez-FerrerUnder15-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Gonz.C3.A1lez-FerrerUnder15_9-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Gonz\u00e1lez-Ferrer, A.; Peleg, M. (2015). \"Understanding requirements of clinical data standards for developing interoperable knowledge-based DSS: A case study\". <i>Computer Standards & Interfaces<\/i> <b>42<\/b>: 125\u201336. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.csi.2015.06.002\" target=\"_blank\">10.1016\/j.csi.2015.06.002<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Understanding+requirements+of+clinical+data+standards+for+developing+interoperable+knowledge-based+DSS%3A+A+case+study&rft.jtitle=Computer+Standards+%26+Interfaces&rft.aulast=Gonz%C3%A1lez-Ferrer%2C+A.%3B+Peleg%2C+M.&rft.au=Gonz%C3%A1lez-Ferrer%2C+A.%3B+Peleg%2C+M.&rft.date=2015&rft.volume=42&rft.pages=125%E2%80%9336&rft_id=info:doi\/10.1016%2Fj.csi.2015.06.002&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ParimbelliDecision16-10\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-ParimbelliDecision16_10-0\" rel=\"external_link\">10.0<\/a><\/sup> <sup><a href=\"#cite_ref-ParimbelliDecision16_10-1\" rel=\"external_link\">10.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Parimbelli, E.; Sacchi, L.; Bellazzi, R. (2016). <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.ejbi.org\/archive\/ejbi-volume-12-issue-1-year-2016.html\" target=\"_blank\">\"Decision Support through Data Integration: Strategies to Meet the Big Data Challenge\"<\/a>. <i>European Journal for Biomedical Informatics<\/i> <b>12<\/b> (1): en10\u2013en14<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.ejbi.org\/archive\/ejbi-volume-12-issue-1-year-2016.html\" target=\"_blank\">https:\/\/www.ejbi.org\/archive\/ejbi-volume-12-issue-1-year-2016.html<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Decision+Support+through+Data+Integration%3A+Strategies+to+Meet+the+Big+Data+Challenge&rft.jtitle=European+Journal+for+Biomedical+Informatics&rft.aulast=Parimbelli%2C+E.%3B+Sacchi%2C+L.%3B+Bellazzi%2C+R.&rft.au=Parimbelli%2C+E.%3B+Sacchi%2C+L.%3B+Bellazzi%2C+R.&rft.date=2016&rft.volume=12&rft.issue=1&rft.pages=en10%E2%80%93en14&rft_id=https%3A%2F%2Fwww.ejbi.org%2Farchive%2Fejbi-volume-12-issue-1-year-2016.html&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MandelSMART16-11\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MandelSMART16_11-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Mandel, J.C.; Kreda, D.A.; Mandl, K.D. et al. (2016). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4997036\" target=\"_blank\">\"SMART on FHIR: A standards-based, interoperable apps platform for electronic health records\"<\/a>. <i>JAMIA<\/i> <b>23<\/b> (5): 899-908. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fjamia%2Focv189\" target=\"_blank\">10.1093\/jamia\/ocv189<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4997036\/\" target=\"_blank\">PMC4997036<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26911829\" target=\"_blank\">26911829<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4997036\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4997036<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=SMART+on+FHIR%3A+A+standards-based%2C+interoperable+apps+platform+for+electronic+health+records&rft.jtitle=JAMIA&rft.aulast=Mandel%2C+J.C.%3B+Kreda%2C+D.A.%3B+Mandl%2C+K.D.+et+al.&rft.au=Mandel%2C+J.C.%3B+Kreda%2C+D.A.%3B+Mandl%2C+K.D.+et+al.&rft.date=2016&rft.volume=23&rft.issue=5&rft.pages=899-908&rft_id=info:doi\/10.1093%2Fjamia%2Focv189&rft_id=info:pmc\/PMC4997036&rft_id=info:pmid\/26911829&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4997036&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PelegMaking13-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PelegMaking13_12-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Peleg, M.; Shahar, Y.; Quaglini, S. (2013). 
<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/joinup.ec.europa.eu\/sites\/default\/files\/document\/2014-06\/ePractice%20Journal-Vol.20-November%202013.pdf\" target=\"_blank\">\"Making healthcare more accessible, better, faster, and cheaper: The MobiGuide Project\"<\/a>. <i>European Journal of ePractice<\/i> (20): 5\u201320<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/joinup.ec.europa.eu\/sites\/default\/files\/document\/2014-06\/ePractice%20Journal-Vol.20-November%202013.pdf\" target=\"_blank\">https:\/\/joinup.ec.europa.eu\/sites\/default\/files\/document\/2014-06\/ePractice%20Journal-Vol.20-November%202013.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Making+healthcare+more+accessible%2C+better%2C+faster%2C+and+cheaper%3A+The+MobiGuide+Project&rft.jtitle=European+Journal+of+ePractice&rft.aulast=Peleg%2C+M.%3B+Shahar%2C+Y.%3B+Quaglini%2C+S.&rft.au=Peleg%2C+M.%3B+Shahar%2C+Y.%3B+Quaglini%2C+S.&rft.date=2013&rft.issue=20&rft.pages=5%E2%80%9320&rft_id=https%3A%2F%2Fjoinup.ec.europa.eu%2Fsites%2Fdefault%2Ffiles%2Fdocument%2F2014-06%2FePractice%2520Journal-Vol.20-November%25202013.pdf&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LenzHealth12-13\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LenzHealth12_13-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Lenz, R.; Peleg, M.; Reichert, M. (2012). 
<a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.igi-global.com\/journal\/international-journal-knowledge-based-organizations\/1177\" target=\"_blank\">\"Healthcare Process Support: Achievements, Challenges, Current Research\"<\/a>. <i>International Journal of Knowledge-Based Organizations<\/i> <b>2<\/b> (4): i\u2013xvi<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.igi-global.com\/journal\/international-journal-knowledge-based-organizations\/1177\" target=\"_blank\">https:\/\/www.igi-global.com\/journal\/international-journal-knowledge-based-organizations\/1177<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Healthcare+Process+Support%3A+Achievements%2C+Challenges%2C+Current+Research&rft.jtitle=International+Journal+of+Knowledge-Based+Organizations&rft.aulast=Lenz%2C+R.%3B+Peleg%2C+M.%3B+Reichert%2C+M.&rft.au=Lenz%2C+R.%3B+Peleg%2C+M.%3B+Reichert%2C+M.&rft.date=2012&rft.volume=2&rft.issue=4&rft.pages=i%E2%80%93xvi&rft_id=https%3A%2F%2Fwww.igi-global.com%2Fjournal%2Finternational-journal-knowledge-based-organizations%2F1177&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AalstProcess12-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AalstProcess12_14-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">van der Aalst, W.; Adriansyah, A.; Alves de Medeiros, A.K. et al. (2012). \"Process Mining Manifesto\". In Daniel, F.; Barkaoui, K.; Dustdar, S. <i>Business Process Management Workshops 2011<\/i>. Lecture Notes in Business Information Processing. <b>99<\/b>. Springer. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2F978-3-642-28108-2_19\" target=\"_blank\">10.1007\/978-3-642-28108-2_19<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9783642281082.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Process+Mining+Manifesto&rft.atitle=Business+Process+Management+Workshops+2011&rft.aulast=van+der+Aalst%2C+W.%3B+Adriansyah%2C+A.%3B+Alves+de+Medeiros%2C+A.K.+et+al.&rft.au=van+der+Aalst%2C+W.%3B+Adriansyah%2C+A.%3B+Alves+de+Medeiros%2C+A.K.+et+al.&rft.date=2012&rft.series=Lecture+Notes+in+Business+Information+Processing&rft.volume=99&rft.pub=Springer&rft_id=info:doi\/10.1007%2F978-3-642-28108-2_19&rft.isbn=9783642281082&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MansProcess13-15\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MansProcess13_15-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Mans, R.S.; van der Aalst, W.; Vanwersch, R.J.B. et al. (2013). \"Process Mining in Healthcare: Data Challenges When Answering Frequently Posed Questions\". In Lenz, R.; Miksch, S.; Peleg, M. et al. <i>Process Support and Knowledge Representation in Health Care<\/i>. Lecture Notes in Computer Science. <b>7738<\/b>. Springer. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2F978-3-642-36438-9_10\" target=\"_blank\">10.1007\/978-3-642-36438-9_10<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9783642364389.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Process+Mining+in+Healthcare%3A+Data+Challenges+When+Answering+Frequently+Posed+Questions&rft.atitle=Process+Support+and+Knowledge+Representation+in+Health+Care&rft.aulast=Mans%2C+R.S.%3B+van+der+Aalst%2C+W.%3B+Vanwersch%2C+R.J.B.+et+al.&rft.au=Mans%2C+R.S.%3B+van+der+Aalst%2C+W.%3B+Vanwersch%2C+R.J.B.+et+al.&rft.date=2013&rft.series=Lecture+Notes+in+Computer+Science&rft.volume=7738&rft.pub=Springer&rft_id=info:doi\/10.1007%2F978-3-642-36438-9_10&rft.isbn=9783642364389&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BellazziPred08-16\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BellazziPred08_16-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Bellazzi, R.; Zupan, B. (2008). \"Predictive data mining in clinical medicine: Current issues and guidelines\". <i>International Journal of Medical Informatics<\/i> <b>77<\/b> (2): 81\u201397. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.ijmedinf.2006.11.006\" target=\"_blank\">10.1016\/j.ijmedinf.2006.11.006<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/17188928\" target=\"_blank\">17188928<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Predictive+data+mining+in+clinical+medicine%3A+Current+issues+and+guidelines&rft.jtitle=International+Journal+of+Medical+Informatics&rft.aulast=Bellazzi%2C+R.%3B+Zupan%2C+B.&rft.au=Bellazzi%2C+R.%3B+Zupan%2C+B.&rft.date=2008&rft.volume=77&rft.issue=2&rft.pages=81%E2%80%9397&rft_id=info:doi\/10.1016%2Fj.ijmedinf.2006.11.006&rft_id=info:pmid\/17188928&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BellazziPredict11-17\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BellazziPredict11_17-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Bellazzi, R.; Ferrazzi, F.; Sacchi, L. (2011). \"Predictive data mining in clinical medicine: A focus on selected methods and applications\". <i>Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery<\/i> <b>1<\/b> (5): 416\u201330. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1002%2Fwidm.23\" target=\"_blank\">10.1002\/widm.23<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Predictive+data+mining+in+clinical+medicine%3A+A+focus+on+selected+methods+and+applications&rft.jtitle=Wiley+Interdisciplinary+Reviews%3A+Data+Mining+and+Knowledge+Discovery&rft.aulast=Bellazzi%2C+R.%3B+Ferrazzi%2C+F.%3B+Sacchi%2C+L.&rft.au=Bellazzi%2C+R.%3B+Ferrazzi%2C+F.%3B+Sacchi%2C+L.&rft.date=2011&rft.volume=1&rft.issue=5&rft.pages=416%E2%80%9330&rft_id=info:doi\/10.1002%2Fwidm.23&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Dawson-SaundersASurv90-18\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Dawson-SaundersASurv90_18-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Dawson-Saunders, B.; Feltovich, P.J.; Coulson, R.L.; Steward, D.E. (1990). \"A survey of medical school teachers to identify basic biomedical concepts medical students should understand\". <i>Academic Medicine<\/i> <b>65<\/b> (7): 448\u201354. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/2242199\" target=\"_blank\">2242199<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+survey+of+medical+school+teachers+to+identify+basic+biomedical+concepts+medical+students+should+understand&rft.jtitle=Academic+Medicine&rft.aulast=Dawson-Saunders%2C+B.%3B+Feltovich%2C+P.J.%3B+Coulson%2C+R.L.%3B+Steward%2C+D.E.&rft.au=Dawson-Saunders%2C+B.%3B+Feltovich%2C+P.J.%3B+Coulson%2C+R.L.%3B+Steward%2C+D.E.&rft.date=1990&rft.volume=65&rft.issue=7&rft.pages=448%E2%80%9354&rft_id=info:pmid\/2242199&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Gonz.C3.A1lez-FerrerDiag16-19\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Gonz.C3.A1lez-FerrerDiag16_19-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Gonz\u00e1lez-Ferrer, A.; Valc\u00e1rcel, M.; Ch\u00e1fer, J. et al. (2016). \"Diagn\u00f3stico y tratamiento de hiponatremia usando modelos computacionales de gu\u00edas de pr\u00e1ctica cl\u00ednica\". 
<i>Actas del XIX Congreso Nacional de Inform\u00e1tica para la Salud, INFORSALUD<\/i> <b>2016<\/b>: 193\u2013198.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Diagn%C3%B3stico+y+tratamiento+de+hiponatremia+usando+modelos+computacionales+de+gu%C3%ADas+de+pr%C3%A1ctica+cl%C3%ADnica&rft.jtitle=Actas+del+XIX+Congreso+Nacional+de+Inform%C3%A1tica+para+la+Salud%2C+INFORSALUD&rft.aulast=Gonz%C3%A1lez-Ferrer%2C+A.%3B+Valc%C3%A1rcel%2C+M.%3B+Ch%C3%A1fer%2C+J.+et+al.&rft.au=Gonz%C3%A1lez-Ferrer%2C+A.%3B+Valc%C3%A1rcel%2C+M.%3B+Ch%C3%A1fer%2C+J.+et+al.&rft.date=2016&rft.volume=2016&rft.pages=193%E2%80%93198&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Gonz.C3.A1lez-FerrerDev17-20\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Gonz.C3.A1lez-FerrerDev17_20-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Gonz\u00e1lez-Ferrer, A.; Valc\u00e1rcel, M.; Cuesta, M. et al. (2017). \"Development of a computer-interpretable clinical guideline model for decision support in the differential diagnosis of hyponatremia\". <i>International Journal of Medical Informatics<\/i> <b>103<\/b>: 55\u201364. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.ijmedinf.2017.04.014\" target=\"_blank\">10.1016\/j.ijmedinf.2017.04.014<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/28551002\" target=\"_blank\">28551002<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Development+of+a+computer-interpretable+clinical+guideline+model+for+decision+support+in+the+differential+diagnosis+of+hyponatremia&rft.jtitle=International+Journal+of+Medical+Informatics&rft.aulast=Gonz%C3%A1lez-Ferrer%2C+A.%3B+Valc%C3%A1rcel%2C+M.%3B+Cuesta%2C+M.+et+al.&rft.au=Gonz%C3%A1lez-Ferrer%2C+A.%3B+Valc%C3%A1rcel%2C+M.%3B+Cuesta%2C+M.+et+al.&rft.date=2017&rft.volume=103&rft.pages=55%E2%80%9364&rft_id=info:doi\/10.1016%2Fj.ijmedinf.2017.04.014&rft_id=info:pmid\/28551002&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-FoxDissem98-21\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-FoxDissem98_21-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Fox, J.; Johns, N.; Rahmanzadeh, A. et al. (1998). \"Disseminating medical knowledge: The PROforma approach\". <i>Artificial Intelligence in Medicine<\/i> <b>14<\/b> (1\u20132): 157\u201381. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/9779888\" target=\"_blank\">9779888<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Disseminating+medical+knowledge%3A+The+PROforma+approach&rft.jtitle=Artificial+Intelligence+in+Medicine&rft.aulast=Fox%2C+J.%3B+Johns%2C+N.%3B+Rahmanzadeh%2C+A.+et+al.&rft.au=Fox%2C+J.%3B+Johns%2C+N.%3B+Rahmanzadeh%2C+A.+et+al.&rft.date=1998&rft.volume=14&rft.issue=1%E2%80%932&rft.pages=157%E2%80%9381&rft_id=info:pmid\/9779888&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-FoxOpen15-22\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-FoxOpen15_22-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Fox, J.; Gutenstein, M.; Khan, O. et al. (2015). \"OpenClinical.net: A platform for creating and sharing knowledge and promoting best practice in healthcare\". <i>Computers in Industry<\/i> <b>66<\/b>: 63\u201372. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.compind.2014.10.001\" target=\"_blank\">10.1016\/j.compind.2014.10.001<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=OpenClinical.net%3A+A+platform+for+creating+and+sharing+knowledge+and+promoting+best+practice+in+healthcare&rft.jtitle=Computers+in+Industry&rft.aulast=Fox%2C+J.%3B+Gutenstein%2C+M.%3B+Khan%2C+O.+et+al.&rft.au=Fox%2C+J.%3B+Gutenstein%2C+M.%3B+Khan%2C+O.+et+al.&rft.date=2015&rft.volume=66&rft.pages=63%E2%80%9372&rft_id=info:doi\/10.1016%2Fj.compind.2014.10.001&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-FenskeUtility10-23\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-FenskeUtility10_23-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Fenske, W.; Maier, S.K.; Blechschmidt, A. et al. (2010). \"Utility and limitations of the traditional diagnostic approach to hyponatremia: A diagnostic study\". <i>American Journal of Medicine<\/i> <b>123<\/b> (7): 652\u20137. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.amjmed.2010.01.013\" target=\"_blank\">10.1016\/j.amjmed.2010.01.013<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/20609688\" target=\"_blank\">20609688<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Utility+and+limitations+of+the+traditional+diagnostic+approach+to+hyponatremia%3A+A+diagnostic+study&rft.jtitle=American+Journal+of+Medicine&rft.aulast=Fenske%2C+W.%3B+Maier%2C+S.K.%3B+Blechschmidt%2C+A.+et+al.&rft.au=Fenske%2C+W.%3B+Maier%2C+S.K.%3B+Blechschmidt%2C+A.+et+al.&rft.date=2010&rft.volume=123&rft.issue=7&rft.pages=652%E2%80%937&rft_id=info:doi\/10.1016%2Fj.amjmed.2010.01.013&rft_id=info:pmid\/20609688&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Gonz.C3.A1lez-FerrerComorb-24\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Gonz.C3.A1lez-FerrerComorb_24-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Gonz\u00e1lez-Ferrer, A.; Valc\u00e1rcel, M.; Cuesta, M. et al. \"Comorbidities in the Syndrome of Inappropriate Antidiuretic Hormone Secretion: A Hierarchical Clustering Analysis on Discharge Data\". 
<i>To be published<\/i>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Comorbidities+in+the+Syndrome+of+Inappropriate+Antidiuretic+Hormone+Secretion%3A+A+Hierarchical+Clustering+Analysis+on+Discharge+Data&rft.jtitle=To+be+published&rft.aulast=Gonz%C3%A1lez-Ferrer%2C+A.%3B+Valc%C3%A1rcel%2C+M.%3B+Cuesta%2C+M.+et+al.&rft.au=Gonz%C3%A1lez-Ferrer%2C+A.%3B+Valc%C3%A1rcel%2C+M.%3B+Cuesta%2C+M.+et+al.&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ClarkeTesting08-25\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ClarkeTesting08_25-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Clarke, K.R.; Somerfield, P.J.; Gorley, R.N. et al. (2008). \"Testing of null hypotheses in exploratory community analyses: Similarity profiles and biota-environment linkage\". <i>Journal of Experimental Marine Biology and Ecology<\/i> <b>366<\/b> (1\u20132): 56\u201369. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.jembe.2008.07.009\" target=\"_blank\">10.1016\/j.jembe.2008.07.009<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Testing+of+null+hypotheses+in+exploratory+community+analyses%3A+Similarity+profiles+and+biota-environment+linkage&rft.jtitle=Journal+of+Experimental+Marine+Biology+and+Ecology&rft.aulast=Clarke%2C+K.R.%3B+Somerfield%2C+P.J.%3B+Gorley%2C+R.N.+et+al.&rft.au=Clarke%2C+K.R.%3B+Somerfield%2C+P.J.%3B+Gorley%2C+R.N.+et+al.&rft.date=2008&rft.volume=366&rft.issue=1%E2%80%932&rft.pages=56%E2%80%9369&rft_id=info:doi\/10.1016%2Fj.jembe.2008.07.009&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-OwenMahout11-26\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-OwenMahout11_26-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Owen, S.; Anil, R.; Dunning, T. et al. (2011). <i>Mahout in Action<\/i>. Manning Publications. pp. 416. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9781935182689.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=book&rft.btitle=Mahout+in+Action&rft.aulast=Owen%2C+S%3B+Anil%2C+R.%3B+Dunning%2C+T.+et+al.&rft.au=Owen%2C+S%3B+Anil%2C+R.%3B+Dunning%2C+T.+et+al.&rft.date=2011&rft.pages=pp.%26nbsp%3B416&rft.pub=Manning+Publications&rft.isbn=9781935182689&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ENISASmart16-27\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ENISASmart16_27-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.enisa.europa.eu\/publications\/cyber-security-and-resilience-for-smart-hospitals\" target=\"_blank\">\"Smart Hospitals: Security and Resilience for Smart Health Service and Infrastructures\"<\/a>. European Union Agency for Network and Information Security. November 2016. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.2824%2F28801\" target=\"_blank\">10.2824\/28801<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.enisa.europa.eu\/publications\/cyber-security-and-resilience-for-smart-hospitals\" target=\"_blank\">https:\/\/www.enisa.europa.eu\/publications\/cyber-security-and-resilience-for-smart-hospitals<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Smart+Hospitals%3A+Security+and+Resilience+for+Smart+Health+Service+and+Infrastructures&rft.atitle=&rft.date=November+2016&rft.pub=European+Union+Agency+for+Network+and+Information+Security&rft_id=info:doi\/10.2824%2F28801&rft_id=https%3A%2F%2Fwww.enisa.europa.eu%2Fpublications%2Fcyber-security-and-resilience-for-smart-hospitals&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SearaValue16-28\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SearaValue16_28-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Seara, G.; Pay\u00e1, A.; Mayol, J. (2016). \"Value-based healthcare delivery in the digital era\". <i>European Psychiatry<\/i> <b>33<\/b> (Supplement): S33. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.eurpsy.2016.01.862\" target=\"_blank\">10.1016\/j.eurpsy.2016.01.862<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Value-based+healthcare+delivery+in+the+digital+era&rft.jtitle=European+Psychiatry&rft.aulast=Seara%2C+G.%3B+Pay%C3%A1%2C+A.%3B+Mayol%2C+J.&rft.au=Seara%2C+G.%3B+Pay%C3%A1%2C+A.%3B+Mayol%2C+J.&rft.date=2016&rft.volume=33&rft.issue=Supplement&rft.pages=S33&rft_id=info:doi\/10.1016%2Fj.eurpsy.2016.01.862&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Guti.C3.A9rrezManual13-29\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Guti.C3.A9rrezManual13_29-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Guti\u00e9rrez, A.R.; Cuenca, G.M.; Acebedo, I.A. et al. (June 2013). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/gesdoc.isciii.es\/gesdoccontroller?action=download&id=29\/11\/2013-45c9ee530c\" target=\"_blank\">\"Manual pr\u00e1ctico de interoperabilidad sem\u00e1ntica para entornos sanitarios basada en arquetipos\"<\/a> (PDF). Unidad de Investigaci\u00f3n en Telemedicina y e-Salud. pp. 152<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/gesdoc.isciii.es\/gesdoccontroller?action=download&id=29\/11\/2013-45c9ee530c\" target=\"_blank\">http:\/\/gesdoc.isciii.es\/gesdoccontroller?action=download&id=29\/11\/2013-45c9ee530c<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Manual+pr%C3%A1ctico%0Ade+interoperabilidad+sem%C3%A1ntica+para+entornos+sanitarios+basada+en+arquetipos&rft.atitle=&rft.aulast=Guti%C3%A9rrez%2C+A.R.%3B+Cuenca%2C+G.M.%3B+Acebedo%2C+I.A.+et+al.&rft.au=Guti%C3%A9rrez%2C+A.R.%3B+Cuenca%2C+G.M.%3B+Acebedo%2C+I.A.+et+al.&rft.date=June+2013&rft.pages=pp.+152&rft.pub=Unidad+de+Investigaci%C3%B3n+en+Telemedicina+y+e-Salud&rft_id=http%3A%2F%2Fgesdoc.isciii.es%2Fgesdoccontroller%3Faction%3Ddownload%26id%3D29%2F11%2F2013-45c9ee530c&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-EURegulation16-30\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-EURegulation16_30-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/eur-lex.europa.eu\/legal-content\/EN\/TXT\/?uri=CELEX:32016R0679\" target=\"_blank\">\"Regulation (EU) 2016\/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95\/46\/EC (General Data Protection Regulation)\"<\/a>. <i>EUR-Lex<\/i>. European Union. 2016<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/eur-lex.europa.eu\/legal-content\/EN\/TXT\/?uri=CELEX:32016R0679\" target=\"_blank\">http:\/\/eur-lex.europa.eu\/legal-content\/EN\/TXT\/?uri=CELEX:32016R0679<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Regulation+%28EU%29+2016%2F679+of+the+European+Parliament+and+of+the+Council+of+27+April+2016+on+the+protection+of+natural+persons+with+regard+to+the+processing+of+personal+data+and+on+the+free+movement+of+such+data%2C+and+repealing+Directive+95%2F46%2FEC+%28General+Data+Protection+Regulation%29&rft.atitle=EUR-Lex&rft.date=2016&rft.pub=European+Union&rft_id=http%3A%2F%2Feur-lex.europa.eu%2Flegal-content%2FEN%2FTXT%2F%3Furi%3DCELEX%3A32016R0679&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Costa-PereiraInform09-31\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Costa-PereiraInform09_31-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Costa-Pereira, A.; Chen, R.; Almeida, F.C. et al. (2009). \"Chapter 4: Data Quality and Integration Issues in Electronic Health Records\". In Hristidis, V. <i>Information Discovery on Electronic Health Records<\/i>. Taylor & Francis Group. pp. 55\u201395. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9781420090413.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Chapter+4%3A+Data+Quality+and+Integration+Issues+in+Electronic+Health+Records&rft.atitle=Information+Discovery+on+Electronic+Health+Records&rft.aulast=Costa-Pereira%2C+A.%3B+Chen%2C+R.%3B+Almeida%2C+F.C.+et+al.&rft.au=Costa-Pereira%2C+A.%3B+Chen%2C+R.%3B+Almeida%2C+F.C.+et+al.&rft.date=2009&rft.pages=pp.%26nbsp%3B55%E2%80%9395&rft.pub=Taylor+%26+Francis+Group&rft.isbn=9781420090413&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WeiskopfMeth13-32\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WeiskopfMeth13_32-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Weiskopf, N.G.; Weng, C. (2013). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3555312\" target=\"_blank\">\"Methods and dimensions of electronic health record data quality assessment: Enabling reuse for clinical research\"<\/a>. <i>JAMIA<\/i> <b>20<\/b> (1): 144\u201351. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1136%2Famiajnl-2011-000681\" target=\"_blank\">10.1136\/amiajnl-2011-000681<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3555312\/\" target=\"_blank\">PMC3555312<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/22733976\" target=\"_blank\">22733976<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3555312\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3555312<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Methods+and+dimensions+of+electronic+health+record+data+quality+assessment%3A+Enabling+reuse+for+clinical+research&rft.jtitle=JAMIA&rft.aulast=Weiskopf%2C+N.G.%3B+Weng%2C+C.&rft.au=Weiskopf%2C+N.G.%3B+Weng%2C+C.&rft.date=2013&rft.volume=20&rft.issue=1&rft.pages=144%E2%80%9351&rft_id=info:doi\/10.1136%2Famiajnl-2011-000681&rft_id=info:pmc\/PMC3555312&rft_id=info:pmid\/22733976&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3555312&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BrownLegal14-33\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BrownLegal14_33-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Brown, S.H.; Miller, R.A. (2014). 
\"Chapter 26: Legal and Regulatory Issues Related to the Use of Clinical Software in Health Care Delivery\". In Greenes, R.A.. <i>Clinical Decision Support<\/i>. Academic Press. pp. 711\u2013740. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2FB978-0-12-398476-0.00026-9\" target=\"_blank\">10.1016\/B978-0-12-398476-0.00026-9<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9780123984760.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Chapter+26%3A+Legal+and+Regulatory+Issues+Related+to+the+Use+of+Clinical+Software+in+Health+Care+Delivery&rft.atitle=Clinical+Decision+Support&rft.aulast=Brown%2C+S.H.%3B+Miller%2C+R.A.&rft.au=Brown%2C+S.H.%3B+Miller%2C+R.A.&rft.date=2014&rft.pages=pp.%26nbsp%3B711%E2%80%93740&rft.pub=Academic+Press&rft_id=info:doi\/10.1016%2FB978-0-12-398476-0.00026-9&rft.isbn=9780123984760&rfr_id=info:sid\/en.wikipedia.org:Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to grammar, spelling, and presentation, including the addition of PMCID and DOI when they were missing from the original reference. 
The inline citation for citation 24 was misnumbered in the original text; it's corrected here.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare\">https:\/\/www.limswiki.org\/index.php\/Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div 
class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","75f1e35ff0bfbfc1a106a26c2f646394_images":["https:\/\/www.limswiki.org\/images\/5\/5f\/Fig1_Gonz%C3%A1lez-FerrerIJIMAI2018_4-7.png"],"75f1e35ff0bfbfc1a106a26c2f646394_timestamp":1544813857,"a68557faaf217ce0a165c006c2605bb8_type":"article","a68557faaf217ce0a165c006c2605bb8_title":"Information management for enabling systems medicine (Ganzinger and Knaup 2017)","a68557faaf217ce0a165c006c2605bb8_url":"https:\/\/www.limswiki.org\/index.php\/Journal:Information_management_for_enabling_systems_medicine","a68557faaf217ce0a165c006c2605bb8_plaintext":"\n\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\n\t\t\t\tJournal:Information management for enabling systems medicine\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tFrom LIMSWiki\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\tJump to: navigation, search\n\n\t\t\t\t\t\n\t\t\t\t\tFull article title\n \nInformation management for enabling systems medicineJournal\n \nCurrent Directions in Biomedical EngineeringAuthor(s)\n \nGanzinger, Matthias; Knaup, PetraAuthor affiliation(s)\n \nHeidelberg University's Institute of Medical Biometry and InformaticsPrimary contact\n \nEmail: matthias dot ganzinger at med dot uni-heidelberg dot deYear published\n \n2017Volume and issue\n \n3 (2)Page(s)\n \n0105DOI\n \n10.1515\/cdbme-2017-0105ISSN\n \n2364-5504Distribution license\n \nCreative Commons Attribution-NonCommercial-NoDerivatives 4.0 InternationalWebsite\n \nhttps:\/\/www.degruyter.com\/view\/j\/cdbme.2017.3.issue-2\/cdbme-2017-0105\/cdbme-2017-0105.xmlDownload\n \nhttps:\/\/www.degruyter.com\/downloadpdf\/j\/cdbme.2017.3.issue-2\/cdbme-2017-0105\/cdbme-2017-0105.xml (PDF)\n\nContents\n\n1 Abstract \n2 Introduction \n3 Methods \n4 Results \n\n4.1 Knowledge base \n\n4.1.1 Case base \n4.1.2 Rule base \n\n\n4.2 Inference \n4.3 Prototype implementation \n\n\n5 Discussion \n6 Author's statement \n7 References \n8 Notes \n\n\n\nAbstract \nSystems medicine is a data-oriented approach in research and 
clinical practice to support the study and treatment of complex diseases. It relies on well-defined information management processes providing comprehensive and up-to-date information as the basis for electronic decision support. The authors suggest a three-layer information technology (IT) architecture for systems medicine and a cyclic data management approach, including a knowledge base that is dynamically updated by extract, transform, and load (ETL) procedures. Case-based and rule-based components are suggested for decision support. Results are presented via a user interface that acknowledges clinical requirements in terms of time and complexity. The systems medicine application was implemented as a prototype. \nKeywords: systems medicine, information management, decision support systems\n\nIntroduction \nSystems medicine is a current approach to aid physicians and researchers in the treatment and investigation of complex diseases. According to the definition of the European Commission, \u201c\u2018Systems medicine\u2019 is the application of systems biology approaches to medical research and medical practice. Its objective is to integrate a variety of biological\/medical data at all relevant levels of cellular organization using the power of computational and mathematical modelling, to enable understanding of the pathophysiological mechanisms, prognosis, diagnosis and treatment of disease.\u201d[1] Consequently, the management of data is of great importance for systems medicine in research as well as clinical practice. Typically, data of different sources such as electronic health record systems, clinical research databases, or biomedical knowledge representations like ontologies have to be reviewed and prepared. The most prevalent data sources in systems medicine research projects are omics data and clinical data.[2]\nDue to the comprehensive approach of systems medicine, neither disease-specific knowledge nor clinical data can be considered static. 
Thus, we suggest understanding information management for systems medicine as a dynamic process that evolves over time and leads to cyclic updates of the knowledge and data repositories behind the corresponding information technology (IT) system.\nFurther challenges arise from the broad availability of so-called omics data. This class of data \u2014 for example RNA microarray data \u2014 is characterized by a huge amount of attributes per sample that is often disproportional to the number of available cases. Currently, specific data preparation pipelines using statistical approaches like feature selection are necessary to make these data accessible for decision support solutions.[3]\n\nMethods \nFor successful information management in the context of systems medicine, it is useful to distinguish between the IT architecture and the data management process. The architecture depends on the requirements of a specific systems medicine application. As a generic high-level architecture we propose a three-layer model[4]:\n1. Data representation: Data and knowledge from different sources have to be prepared and made available for use in systems medicine. This includes data harmonization, transformation, and storage. \n2. Decision support: Data and knowledge from layer 1 are processed by applying decision support approaches like case-based reasoning (CBR), deductive classifiers (rules-based), or systems biology models. Depending on the context, such components can be combined into hybrid systems.\n3. User interface: Systems medicine applications should be designed to assist and not replace human decisions. Consequently, the user interface for such an application must be carefully designed to support well-informed, reproducible clinical decisions in an appropriate time frame.\nThe complexity of the data management process depends on the level of heterogeneity prevalent in the data sources. 
To achieve sufficient case numbers, it is often necessary to combine data on the same entity types from different sources. For example, hospitals may decide to collaborate and share clinical data on a specific disease area to build a joint systems medicine application with a higher number of cases and therefore greater statistical power (multi-center approach).\nIn most cases, clinical documentation will not be based on identical specifications. Thus, in a harmonization step, data definitions have to be evaluated for each attribute, both on a syntactic and semantic level. The resulting common data definition should be implemented into an automated extract, transform, and load (ETL) process to facilitate repeated loading of data so that the decision support system stays up to date.[5][6]\n\nResults \nAn overview of the resulting information management model is shown in Figure 1. In the following paragraphs the elements of this model are described in detail.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 1: Information management model for systems medicine\n\n\n\nKnowledge base \nThe core concept of the model is the knowledge base, which contains patient and disease related data, as well as formally represented knowledge. As such, it forms a systems medicine model in a broader sense. Specifically, the knowledge base is comprised of a case base and a rule base.\n\nCase base \nThe case base covers information on the available experience in treating patients with a specific disease. Typically, such information is organized as case descriptions. Each case is described by a harmonized set of attributes covering clinical data, omics data, and others. The case base can be used for various research purposes like data mining or construction of systems biology models. 
In addition, its cases can be used for decision support directly by using the concept of patient similarity.[7] In the field of clinical artificial intelligence systems, patient similarity has been the subject of research for many years, most notably in case-based reasoning.[8]\nIn terms of clinical data, the case base contains typical data like diagnoses, procedures, side effects, and laboratory values. For some diseases, medical images such as computed tomography, magnetic resonance imaging, or microscope images can be included and processed with corresponding similarity measures. More recently, case bases have been enriched by molecular data describing different steps of the genetic process chain from DNA via proteins to cell-level regulatory processes.[3] However, the use of these data in the context of patient similarity is still challenging due to the large number of parameters involved.\n\nRule base \nSince the case base only covers information on an institution\u2019s previous experiences in treating a disease, it might not be comprehensive in terms of current evidence-based medical knowledge. This can be mitigated by adding a rule base to the systems medicine application. Rules formally represent medical knowledge in a way that can be interpreted by a rules engine like HermiT.[9] Rules can be derived from various sources; medical treatment guidelines are rule sets intended for human interpretation that can be computerized. 
New findings on the treatment of a disease can be extracted from textual scientific literature by a manual or automated curation process.[10]\nMore suitable are sources that are computer-interpretable by design like gene ontology.[11] Published systems biology models can be part of a rule base in a broader sense since they provide machine-interpretable models, possibly described in systems biology mark-up language (SBML) and processed by a SBML simulation engine like COPASI.[12][13] For a rule base, a continuous curation process has to be established to ensure the timely availability of new knowledge, for example, when new treatment guidelines are published.\n\nInference \nIn contrast to systems biology, where understanding and in silico simulation of biological processes down to the cellular level are in focus, systems medicine always aims at supporting treatment decisions for individual patients. As shown in Figure 1, the knowledge base is the foundation for drawing conclusions for these individual patients. The technical aspects of this inference process differ in accordance with the type of knowledge available. For similarity-based inference methods like CBR, individual case instances are retrieved with a focus on maximizing similarity with the newly presented patient case. Consequently, a new patient has to be described using as many attributes as possible from the set of attributes in the case base. An individual treatment decision is made based on the outcome of the most similar patient from the case base. Especially for life threatening diseases like cancers, only the first treatment approach might be of interest for a newly diagnosed patient since later therapies cannot be considered independent of previous attempts.\nFor rule-based decision support, clinical data of new patients have to be defined in a way comparable to the case-based approaches. This set of individual attributes is used as input for the rules engine or model simulation. 
The result is a personalized treatment recommendation for the patient. \nNo matter which decision support method was used, the outcome of the treatment should be documented and added to the knowledge base, either as an additional case or by refining the rules and models as the patient population grows.\n\nPrototype implementation \nFor the systems medicine project \u201cclinically applicable, omics-based assessment of survival, side effects, and targets in multiple myeloma\u201d (CLIOMMICS), a prototype of an IT system for systems medicine is being established according to the proposed architecture and data management process. As a disease model, the project examines multiple myeloma, a malignancy of plasma cells in the bone marrow.\nData (clinical parameters and omics data) have been harmonized and stored in a research data warehouse based on the open-source software \u201cInformatics for Integrating Biology & the Bedside\u201d (i2b2).[14] Data harmonization rules are documented as metadata which are used in an automated ETL process to ensure continuous updates of the case base.[5] Data in i2b2 are organized according to the star schema. While this data schema is optimized for analytical queries, for some purposes a flat case-oriented presentation of the data is desirable. We implemented a Generic Case Extractor (GCE) allowing a comprehensive data export as a matrix containing one line per case.[15]\nWhile i2b2 can be used directly through its user interface for research purposes, it is also used as a unified source for a case base. On this foundation, a case-based reasoning module was established with the help of the Java-based CBR software framework myCBR.[16] To reflect the specific requirements of the cancer, a specific similarity measure based on survival data was developed. The user interface in the form of a web portal with dedicated portlets visualizing CBR results is currently being developed. 
An additional part of the user interface is a report generator for generating medical letters covering results, e.g., for gene expression data.\n\nDiscussion \nInformation management for systems medicine is a demanding task requiring a multi-level approach to build a sustainable infrastructure. Special care has to be taken to address inherent dynamics of data that are used for systems medicine; over time the number of available health records will increase and treatment approaches will change, for example with the availability of new compounds. Such effects will have to be reflected in the corresponding knowledge base, no matter whether a case-based, rule-based, or other concept is implemented. In the authors\u2019 opinion, the effort of harmonizing data for use in systems medicine should not only be used as a basis for clinical decision support but also made available for research (e.g., data mining). One possibility is the establishment of ETL processes for a biomedical data warehouse as suggested in this manuscript.\nEvaluating a systems medicine application is challenging, especially in context of cancer diseases. Common in silico evaluation approaches like splitting patient data sets into training and test cohorts might not be applicable since it is hard to draw conclusions on test patients, who actually received a different treatment than the one the decision support component suggests. Eventually, a prospective controlled clinical trial might be necessary to compare the performance of a systems medicine application against unsupported decision making. However, such a trial will have to pass high ethical barriers.\nFurther research is necessary in the field of human-computer interaction. This is especially important for the field of systems medicine since physicians bear the burden of being responsible for the patient but only have limited resources in terms of time and budget at their disposal. 
Thus, it is necessary to provide the essence of a complex data analysis process to empower them to make a good decision for the health of patients.\nThe data management model presented here provides a blueprint for building comprehensive knowledge bases as they are required for systems medicine applications. Due to its generic nature, the model can be used with other IT systems as well.\n\nAuthor's statement \nResearch funding: This work was funded by the German Ministry of Education and Research via the e:Med project CLIOMMICS (grant id: 01ZX1609A). \nConflict of interest: Authors state no conflict of interest. \nInformed consent: Informed consent is not applicable. \nEthical approval: The conducted research is not related to either human or animal use.\n\nReferences \n\n\n\u2191 Auffray, C.; Balling, R.; Bensen, M. et al. (15 June 2010). \"From Systems Biology to Systems Medicine\". In Kyriakopoulou, C.; Mulligan, B. (PDF). European Commission. http:\/\/ec.europa.eu\/research\/health\/pdf\/systems-medicine-workshop-report_en.pdf .   \n\n\u2191 Gietzelt, M.; L\u00f6pprich, M.; Karmen, C. et al. (2016). \"Models and Data Sources Used in Systems Medicine: A Systematic Literature Review\". Methods of Information in Medicine 55 (2): 107\u201313. doi:10.3414\/ME15-01-0151. PMID 26846174.   \n\n\u2191 3.0 3.1 Anaissi, A.; Goyal, M.; Catchpoole, D.R. et al. (2015). \"Case-based retrieval framework for gene expression data\". Cancer Informatics 14: 21\u201331. doi:10.4137\/CIN.S22371. PMC PMC4368049. PMID 25861214. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4368049 .   \n\n\u2191 Ganzinger, M.; Gietzelt, M.; Karmen, C. et al. (2015). \"An IT Architecture for Systems Medicine\". Studies in Health Technology and Informatics 210: 185-9. PMID 25991127.   \n\n\u2191 Firnkorn, D.; Ganzinger, M.; Muley, T. et al. (2015). \"A Generic Data Harmonization Process for Cross-linked Research and Network Interaction. 
Construction and Application for the Lung Cancer Phenotype Database of the German Center for Lung Research\". Methods of Information in Medicine 54 (5): 455-60. doi:10.3414\/ME14-02-0030. PMID 26394900.   \n\n\u2191 Vassiliadis, P. (2009). \"A Survey of Extract\u2013Transform\u2013Load Technology\". International Journal of Data Warehousing and Mining 5 (3): 27. doi:10.4018\/jdwm.2009070101.   \n\n\u2191 Brown, S.A. (2016). \"Patient Similarity: Emerging Concepts in Systems and Precision Medicine\". Frontiers in Physiology 7: 561. doi:10.3389\/fphys.2016.00561. PMC PMC5121278. PMID 27932992. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5121278 .   \n\n\u2191 Aamodt, A.; Plaza, E. (1994). \"Case-based reasoning: foundational issues, methodological variations, and system approaches\". AI Communications 7 (1): 39\u201359.   \n\n\u2191 Motik, B.; Shearer, R.; Horrocks, I. (2009). \"Hypertableau Reasoning for Description Logics\". Journal Of Artificial Intelligence Research 36: 165\u2013228. doi:10.1613\/jair.2811.   \n\n\u2191 Singhal, A.; Simmons, M.; Lu, Z. (2016). \"Text Mining Genotype-Phenotype Relationships from Biomedical Literature for Database Curation and Precision Medicine\". PLoS Computational Biology 12 (11): e1005017. doi:10.1371\/journal.pcbi.1005017. PMC PMC5130168. PMID 27902695. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5130168 .   \n\n\u2191 Gene Ontology Consortium (2008). \"The Gene Ontology project in 2008\". Nucleic Acids Research 36 (DB1): D440-4. doi:10.1093\/nar\/gkm883. PMC PMC2238979. PMID 17984083. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2238979 .   \n\n\u2191 Ghosh, S.; Matsuoka, Y.; Asai, Y. et al. (2011). \"Software for systems biology: From tools to integrated platforms\". Nature Reviews Genetics 12 (12): 821-32. doi:10.1038\/nrg3096. PMID 22048662.   \n\n\u2191 Hucka, M.; Bergmann, F.T.; Hoops, S. et al. (2015). 
\"The Systems Biology Markup Language (SBML): Language Specification for Level 3 Version 1 Core\". Journal of Integrative Bioinformatics 12 (2): 266. doi:10.2390\/biecoll-jib-2015-266. PMC PMC5451324. PMID 26528564. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5451324 .   \n\n\u2191 Ganslandt, T.; Mate, S.; Helbing, K. et al. (2011). \"Unlocking Data for Clinical Research - The German i2b2 Experience\". Applied Clinical Informatics 2 (1): 116\u201327. doi:10.4338\/ACI-2010-09-CR-0051. PMC PMC3631913. PMID 23616864. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3631913 .   \n\n\u2191 Firnkorn, D.; Merker, S.; Ganzinger, M. et al. (2016). \"Unlocking Data for Statistical Analyses and Data Mining: Generic Case Extraction of Clinical Items from i2b2 and tranSMART\". Studies in Health Technology and Informatics 228: 567\u201371. PMID 27577447.   \n\n\u2191 Bach, K.; Sauer, C.; Althoff, K.-D.; Roth-Berghofer, T. (2014). \"Knowledge Modeling with the Open Source Tool myCBR\". Proceedings of the 10th Workshop on Knowledge Engineering and Software Engineering 2014. https:\/\/sds.dfki.de\/publication\/knowledge-modeling-open-source-tool-mycbr .   
\n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to grammar, spelling, and presentation, including the addition of PMCID and DOI when they were missing from the original reference.\n\n\n\n\n\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Information_management_for_enabling_systems_medicine\">https:\/\/www.limswiki.org\/index.php\/Journal:Information_management_for_enabling_systems_medicine<\/a>\n\t\t\t\t\tCategories: LIMSwiki journal articles (added in 2018); LIMSwiki journal articles (all); LIMSwiki journal articles on health informatics\n
\r\n\n\t\r\n\n\t\r\n\n\t\r\n\n\t\n\t\r\n\n \r\n\n\t\n\t\r\n\n \r\n\n\t\n\t\r\n\n\t\n\t\r\n\n\t\r\n\n\t\r\n\n\t\r\n\t\t\n\t\t\n\t\t\t\n\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t This page was last modified on 23 January 2018, at 23:29.\n\t\t\t\t\t\t\t\t\tThis page has been accessed 389 times.\n\t\t\t\t\t\t\t\t\tContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n\t\t\t\t\t\t\t\t\tPrivacy policy\n\t\t\t\t\t\t\t\t\tAbout LIMSWiki\n\t\t\t\t\t\t\t\t\tDisclaimers\n\t\t\t\t\t\t\t\n\t\t\n\t\t\n\t\t\n\n","a68557faaf217ce0a165c006c2605bb8_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_Information_management_for_enabling_systems_medicine skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:Information management for enabling systems medicine<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p>Systems medicine is a data-oriented approach in research and clinical practice to support the study and treatment of complex diseases. 
It relies on well-defined information management processes providing comprehensive and up-to-date information as the basis for <a href=\"https:\/\/www.limswiki.org\/index.php\/Clinical_decision_support_system\" title=\"Clinical decision support system\" target=\"_blank\" class=\"wiki-link\" data-key=\"095141425468d057aa977016869ca37d\">electronic decision support<\/a>. The authors suggest a three-layer information technology (IT) architecture for systems medicine and a cyclic data management approach, including a knowledge base that is dynamically updated by extract, transform, and load (ETL) procedures. Case-based and rule-based components are suggested for decision support. Results are presented via a user interface that acknowledges clinical requirements in terms of time and complexity. The systems medicine application was implemented as a prototype. \n<\/p><p><b>Keywords<\/b>: systems medicine, information management, decision support systems\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<p>Systems medicine is a current approach to aid physicians and researchers in the treatment and investigation of complex diseases. According to the definition of the European Commission, \u201c\u2018Systems medicine\u2019 is the application of systems biology approaches to medical research and medical practice. Its objective is to integrate a variety of biological\/medical data at all relevant levels of cellular organization using the power of computational and mathematical modelling, to enable understanding of the pathophysiological mechanisms, prognosis, diagnosis and treatment of disease.\u201d<sup id=\"rdp-ebb-cite_ref-AuffrayFromSys10_1-0\" class=\"reference\"><a href=\"#cite_note-AuffrayFromSys10-1\" rel=\"external_link\">[1]<\/a><\/sup> Consequently, the management of data is of great importance for systems medicine in research as well as clinical practice. 
Typically, data of different sources such as <a href=\"https:\/\/www.limswiki.org\/index.php\/Electronic_health_record\" title=\"Electronic health record\" target=\"_blank\" class=\"wiki-link\" data-key=\"f2e31a73217185bb01389404c1fd5255\">electronic health record<\/a> systems, clinical research databases, or biomedical knowledge representations like ontologies have to be reviewed and prepared. The most prevalent data sources in systems medicine research projects are omics data and clinical data.<sup id=\"rdp-ebb-cite_ref-GietzeltModels16_2-0\" class=\"reference\"><a href=\"#cite_note-GietzeltModels16-2\" rel=\"external_link\">[2]<\/a><\/sup>\n<\/p><p>Due to the comprehensive approach of systems medicine, neither disease-specific knowledge nor clinical data can be considered static. Thus, we suggest understanding information management for systems medicine as a dynamic process that evolves over time and leads to cyclic updates of the knowledge and data repositories behind the corresponding information technology (IT) system.\n<\/p><p>Further challenges arise from the broad availability of so-called omics data. This class of data \u2014 for example RNA microarray data \u2014 is characterized by a huge amount of attributes per sample that is often disproportional to the number of available cases. Currently, specific data preparation pipelines using statistical approaches like feature selection are necessary to make these data accessible for decision support solutions.<sup id=\"rdp-ebb-cite_ref-AnaissiCase15_3-0\" class=\"reference\"><a href=\"#cite_note-AnaissiCase15-3\" rel=\"external_link\">[3]<\/a><\/sup>\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Methods\">Methods<\/span><\/h2>\n<p>For successful information management in the context of systems medicine, it is useful to distinguish between the IT architecture and the data management process. The architecture depends on the requirements of a specific systems medicine application. 
As a generic high-level architecture we propose a three-layer model<sup id=\"rdp-ebb-cite_ref-GanzingerAnIT15_4-0\" class=\"reference\"><a href=\"#cite_note-GanzingerAnIT15-4\" rel=\"external_link\">[4]<\/a><\/sup>:\n<\/p><p>1. Data representation: Data and knowledge from different sources have to be prepared and made available for use in systems medicine. This includes data harmonization, transformation, and storage. \n<\/p><p>2. Decision support: Data and knowledge from layer 1 are processed by applying decision support approaches like case-based reasoning (CBR), deductive classifiers (rules-based), or systems biology models. Depending on the context, such components can be combined into hybrid systems.\n<\/p><p>3. User interface: Systems medicine applications should be designed to assist and not replace human decisions. Consequently, the user interface for such an application must be carefully designed to support well-informed, reproducible clinical decisions in an appropriate time frame.\n<\/p><p>The complexity of the data management process depends on the level of heterogeneity prevalent in the data sources. To achieve sufficient case numbers, it is often necessary to combine data on the same entity types from different sources. For example, hospitals may decide to collaborate and share clinical data on a specific disease area to build a joint systems medicine application with a higher number of cases and therefore greater statistical power (multi-center approach).\n<\/p><p>In most cases, clinical documentation will not be based on identical specifications. Thus, in a harmonization step, data definitions have to be evaluated for each attribute, both on a syntactic and semantic level. 
The resulting common data definition should be implemented in an automated extract, transform, and load (ETL) process to facilitate repeated loading of data so that the decision support system stays up to date.<sup id=\"rdp-ebb-cite_ref-FirnkornAGeneric15_5-0\" class=\"reference\"><a href=\"#cite_note-FirnkornAGeneric15-5\" rel=\"external_link\">[5]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-VassiliadisASurv09_6-0\" class=\"reference\"><a href=\"#cite_note-VassiliadisASurv09-6\" rel=\"external_link\">[6]<\/a><\/sup>\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Results\">Results<\/span><\/h2>\n<p>An overview of the resulting information management model is shown in Figure 1. In the following paragraphs, the elements of this model are described in detail.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig1_Ganzinger_CurDirBioEng2017_3-2.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"287e9b38b297c0e52d45939941b42cb5\"><img alt=\"Fig1 Ganzinger CurDirBioEng2017 3-2.png\" src=\"https:\/\/www.limswiki.org\/images\/9\/9a\/Fig1_Ganzinger_CurDirBioEng2017_3-2.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 1:<\/b> Information management model for systems medicine<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"Knowledge_base\">Knowledge base<\/span><\/h3>\n<p>The core concept of the model is the knowledge base, which contains patient- and disease-related data, as well as formally represented knowledge. As such, it forms a systems medicine model in a broader sense. 
Specifically, the knowledge base comprises a case base and a rule base.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Case_base\">Case base<\/span><\/h4>\n<p>The case base covers information on the available experience in treating patients with a specific disease. Typically, such information is organized as case descriptions. Each case is described by a harmonized set of attributes covering clinical data, omics data, and others. The case base can be used for various research purposes like data mining or construction of systems biology models. In addition, its cases can be used for decision support directly by using the concept of patient similarity.<sup id=\"rdp-ebb-cite_ref-BrownPatient16_7-0\" class=\"reference\"><a href=\"#cite_note-BrownPatient16-7\" rel=\"external_link\">[7]<\/a><\/sup> In the field of clinical artificial intelligence systems, patient similarity has been the subject of research for many years, most notably in case-based reasoning.<sup id=\"rdp-ebb-cite_ref-AamodtCase94_8-0\" class=\"reference\"><a href=\"#cite_note-AamodtCase94-8\" rel=\"external_link\">[8]<\/a><\/sup>\n<\/p><p>In terms of clinical data, the case base contains typical data like diagnoses, procedures, side effects, and <a href=\"https:\/\/www.limswiki.org\/index.php\/Laboratory\" title=\"Laboratory\" target=\"_blank\" class=\"wiki-link\" data-key=\"c57fc5aac9e4abf31dccae81df664c33\">laboratory<\/a> values. For some diseases, medical images such as computed tomography, magnetic resonance imaging, or microscope images can be included and processed with corresponding similarity measures. 
More recently, case bases have been enriched by molecular data describing different steps of the genetic process chain from DNA via proteins to cell-level regulatory processes.<sup id=\"rdp-ebb-cite_ref-AnaissiCase15_3-1\" class=\"reference\"><a href=\"#cite_note-AnaissiCase15-3\" rel=\"external_link\">[3]<\/a><\/sup> However, the use of these data in the context of patient similarity is still challenging due to the large number of parameters involved.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Rule_base\">Rule base<\/span><\/h4>\n<p>Since the case base only covers information on an institution\u2019s previous experiences in treating a disease, it might not be comprehensive in terms of current evidence-based medical knowledge. This can be mitigated by adding a rule base to the systems medicine application. Rules formally represent medical knowledge in a way that can be interpreted by a rules engine like HermiT.<sup id=\"rdp-ebb-cite_ref-MotikHyper09_9-0\" class=\"reference\"><a href=\"#cite_note-MotikHyper09-9\" rel=\"external_link\">[9]<\/a><\/sup> Rules can be derived from various sources; medical treatment guidelines are rule sets intended for human interpretation that can be computerized. 
New findings on the treatment of a disease can be extracted from textual scientific literature by a manual or automated curation process.<sup id=\"rdp-ebb-cite_ref-SinghalText16_10-0\" class=\"reference\"><a href=\"#cite_note-SinghalText16-10\" rel=\"external_link\">[10]<\/a><\/sup>\n<\/p><p>More suitable are sources that are computer-interpretable by design like the Gene Ontology.<sup id=\"rdp-ebb-cite_ref-GeneTheGene08_11-0\" class=\"reference\"><a href=\"#cite_note-GeneTheGene08-11\" rel=\"external_link\">[11]<\/a><\/sup> Published systems biology models can be part of a rule base in a broader sense since they provide machine-interpretable models, possibly described in Systems Biology Markup Language (SBML) and processed by an SBML simulation engine like COPASI.<sup id=\"rdp-ebb-cite_ref-GhoshSoftware11_12-0\" class=\"reference\"><a href=\"#cite_note-GhoshSoftware11-12\" rel=\"external_link\">[12]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-HuckaTheSys15_13-0\" class=\"reference\"><a href=\"#cite_note-HuckaTheSys15-13\" rel=\"external_link\">[13]<\/a><\/sup> For a rule base, a continuous curation process has to be established to ensure the timely availability of new knowledge, for example, when new treatment guidelines are published.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Inference\">Inference<\/span><\/h3>\n<p>In contrast to systems biology, where understanding and <i>in silico<\/i> simulation of biological processes down to the cellular level are in focus, systems medicine always aims at supporting treatment decisions for individual patients. As shown in Figure 1, the knowledge base is the foundation for drawing conclusions for these individual patients. The technical aspects of this inference process differ in accordance with the type of knowledge available. For similarity-based inference methods like CBR, individual case instances are retrieved with a focus on maximizing similarity with the newly presented patient case. 
Consequently, a new patient has to be described using as many attributes as possible from the set of attributes in the case base. An individual treatment decision is made based on the outcome of the most similar patient from the case base. Especially for life-threatening diseases like cancers, only the first treatment approach might be of interest for a newly diagnosed patient since later therapies cannot be considered independent of previous attempts.\n<\/p><p>For rule-based decision support, clinical data of new patients have to be defined in a way comparable to the case-based approaches. This set of individual attributes is used as input for the rules engine or model simulation. The result is a personalized treatment recommendation for the patient. \n<\/p><p>No matter which decision support method was used, the outcome of the treatment should be documented and added to the knowledge base, either as an additional case or by refining the rules and models as the patient population grows.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Prototype_implementation\">Prototype implementation<\/span><\/h3>\n<p>For the systems medicine project \u201cclinically applicable, omics-based assessment of survival, side effects, and targets in multiple myeloma\u201d (CLIOMMICS), a prototype of an IT system for systems medicine is being established according to the proposed architecture and data management process. 
As a disease model, the project examines multiple myeloma, a malignancy of plasma cells in the bone marrow.\n<\/p><p>Data (clinical parameters and omics data) have been harmonized and stored in a research data warehouse based on the open-source software \u201cInformatics for Integrating Biology & the Bedside\u201d (i2b2).<sup id=\"rdp-ebb-cite_ref-GanslandtUnlocking11_14-0\" class=\"reference\"><a href=\"#cite_note-GanslandtUnlocking11-14\" rel=\"external_link\">[14]<\/a><\/sup> Data harmonization rules are documented as metadata which are used in an automated ETL process to ensure continuous updates of the case base [5]. Data in i2b2 are organized according to the star schema. While this data schema is optimized for analytical queries, for some purposes a flat case-oriented presentation of the data is desirable. We implemented a Generic Case Extractor (GCE) allowing a comprehensive data export as a matrix containing one line per case.<sup id=\"rdp-ebb-cite_ref-FirnkornUnlock16_15-0\" class=\"reference\"><a href=\"#cite_note-FirnkornUnlock16-15\" rel=\"external_link\">[15]<\/a><\/sup>\n<\/p><p>While i2b2 can be used directly through its user interface for research purposes, it is also used as a unified source for a case base. On this foundation, a case-based reasoning module was established with the help of the Java-based CBR software framework myCBR.<sup id=\"rdp-ebb-cite_ref-BachKnowledge14_16-0\" class=\"reference\"><a href=\"#cite_note-BachKnowledge14-16\" rel=\"external_link\">[16]<\/a><\/sup> To reflect the specific requirements of the cancer, a specific similarity measure based on survival data was developed. The user interface in the form of a web portal with dedicated portlets visualizing CBR results is currently being developed. 
An additional part of the user interface is a report generator for generating medical letters covering results, e.g., for gene expression data.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Discussion\">Discussion<\/span><\/h2>\n<p>Information management for systems medicine is a demanding task requiring a multi-level approach to build a sustainable infrastructure. Special care has to be taken to address the inherent dynamics of the data used for systems medicine; over time the number of available health records will increase and treatment approaches will change, for example with the availability of new compounds. Such effects will have to be reflected in the corresponding knowledge base, no matter whether a case-based, rule-based, or other concept is implemented. In the authors\u2019 opinion, the effort of harmonizing data for use in systems medicine should not only be used as a basis for clinical decision support but also made available for research (e.g., data mining). One possibility is the establishment of ETL processes for a biomedical data warehouse as suggested in this manuscript.\n<\/p><p>Evaluating a systems medicine application is challenging, especially in the context of cancer. Common <i>in silico<\/i> evaluation approaches like splitting patient data sets into training and test cohorts might not be applicable since it is hard to draw conclusions on test patients who actually received a different treatment than the one the decision support component suggests. Eventually, a prospective controlled clinical trial might be necessary to compare the performance of a systems medicine application against unsupported decision making. However, such a trial will have to pass high ethical barriers.\n<\/p><p>Further research is necessary in the field of human-computer interaction. 
This is especially important for the field of systems medicine since physicians bear the burden of being responsible for the patient but only have limited resources in terms of time and budget at their disposal. Thus, it is necessary to provide the essence of a complex data analysis process to empower them to make a good decision for the health of patients.\n<\/p><p>The data management model presented here provides a blueprint for building comprehensive knowledge bases as they are required for systems medicine applications. Due to its generic nature, the model can be used with other IT systems as well.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Author.27s_statement\">Author's statement<\/span><\/h2>\n<p><b>Research funding<\/b>: This work was funded by the German Ministry of Education and Research via the e:Med project CLIOMMICS (grant id: 01ZX1609A). \n<\/p><p><b>Conflict of interest<\/b>: Authors state no conflict of interest. \n<\/p><p><b>Informed consent<\/b>: Informed consent is not applicable. \n<\/p><p><b>Ethical approval<\/b>: The conducted research is not related to either human or animal use.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-AuffrayFromSys10-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AuffrayFromSys10_1-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Auffray, C.; Balling, R.; Bensen, M. et al. (15 June 2010). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/ec.europa.eu\/research\/health\/pdf\/systems-medicine-workshop-report_en.pdf\" target=\"_blank\">\"From Systems Biology to Systems Medicine\"<\/a>. In Kyriakopoulou, C.; Mulligan, B. (PDF). European Commission<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/ec.europa.eu\/research\/health\/pdf\/systems-medicine-workshop-report_en.pdf\" target=\"_blank\">http:\/\/ec.europa.eu\/research\/health\/pdf\/systems-medicine-workshop-report_en.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=From+Systems+Biology+to+Systems+Medicine&rft.atitle=&rft.aulast=Auffray%2C+C.%3B+Balling%2C+R.%3B+Bensen%2C+M.+et+al.&rft.au=Auffray%2C+C.%3B+Balling%2C+R.%3B+Bensen%2C+M.+et+al.&rft.date=15+June+2010&rft.pub=European+Commission&rft_id=http%3A%2F%2Fec.europa.eu%2Fresearch%2Fhealth%2Fpdf%2Fsystems-medicine-workshop-report_en.pdf&rfr_id=info:sid\/en.wikipedia.org:Journal:Information_management_for_enabling_systems_medicine\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GietzeltModels16-2\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GietzeltModels16_2-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Gietzelt, M.; L\u00f6pprich, M.; Karmen, C. et al. (2016). \"Models and Data Sources Used in Systems Medicine: A Systematic Literature Review\". <i>Methods of Information in Medicine<\/i> <b>55<\/b> (2): 107\u201313. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3414%2FME15-01-0151\" target=\"_blank\">10.3414\/ME15-01-0151<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26846174\" target=\"_blank\">26846174<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Models+and+Data+Sources+Used+in+Systems+Medicine%3A+A+Systematic+Literature+Review&rft.jtitle=Methods+of+Information+in+Medicine&rft.aulast=Gietzelt%2C+M.%3B+L%C3%B6pprich%2C+M.%3B+Karmen%2C+C.+et+al.&rft.au=Gietzelt%2C+M.%3B+L%C3%B6pprich%2C+M.%3B+Karmen%2C+C.+et+al.&rft.date=2016&rft.volume=55&rft.issue=2&rft.pages=107%E2%80%9313&rft_id=info:doi\/10.3414%2FME15-01-0151&rft_id=info:pmid\/26846174&rfr_id=info:sid\/en.wikipedia.org:Journal:Information_management_for_enabling_systems_medicine\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AnaissiCase15-3\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-AnaissiCase15_3-0\" rel=\"external_link\">3.0<\/a><\/sup> <sup><a href=\"#cite_ref-AnaissiCase15_3-1\" rel=\"external_link\">3.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Anaissi, A.; Goyal, M.; Catchpoole, D.R. et al. (2015). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4368049\" target=\"_blank\">\"Case-based retrieval framework for gene expression data\"<\/a>. <i>Cancer Informatics<\/i> <b>14<\/b>: 21\u201331. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4137%2FCIN.S22371\" target=\"_blank\">10.4137\/CIN.S22371<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4368049\/\" target=\"_blank\">PMC4368049<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25861214\" target=\"_blank\">25861214<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4368049\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4368049<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Case-based+retrieval+framework+for+gene+expression+data&rft.jtitle=Cancer+Informatics&rft.aulast=Anaissi%2C+A.%3B+Goyal%2C+M.%3B+Catchpoole%2C+D.R.+et+al.&rft.au=Anaissi%2C+A.%3B+Goyal%2C+M.%3B+Catchpoole%2C+D.R.+et+al.&rft.date=2015&rft.volume=14&rft.pages=21%E2%80%9331&rft_id=info:doi\/10.4137%2FCIN.S22371&rft_id=info:pmc\/PMC4368049&rft_id=info:pmid\/25861214&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4368049&rfr_id=info:sid\/en.wikipedia.org:Journal:Information_management_for_enabling_systems_medicine\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GanzingerAnIT15-4\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GanzingerAnIT15_4-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Ganzinger, M.; Gietzelt, M.; Karmen, C. et al. (2015). \"An IT Architecture for Systems Medicine\". 
<i>Studies in Health Technology and Informatics<\/i> <b>210<\/b>: 185-9. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25991127\" target=\"_blank\">25991127<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=An+IT+Architecture+for+Systems+Medicine&rft.jtitle=Studies+in+Health+Technology+and+Informatics&rft.aulast=Ganzinger%2C+M.%3B+Gietzelt%2C+M.%3B+Karmen%2C+C.+et+al.&rft.au=Ganzinger%2C+M.%3B+Gietzelt%2C+M.%3B+Karmen%2C+C.+et+al.&rft.date=2015&rft.volume=210&rft.pages=185-9&rft_id=info:pmid\/25991127&rfr_id=info:sid\/en.wikipedia.org:Journal:Information_management_for_enabling_systems_medicine\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-FirnkornAGeneric15-5\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-FirnkornAGeneric15_5-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Firnkorn, D.; Ganzinger, M.; Muley, T. et al. (2015). \"A Generic Data Harmonization Process for Cross-linked Research and Network Interaction. Construction and Application for the Lung Cancer Phenotype Database of the German Center for Lung Research\". <i>Methods of Information in Medicine<\/i> <b>54<\/b> (5): 455-60. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3414%2FME14-02-0030\" target=\"_blank\">10.3414\/ME14-02-0030<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26394900\" target=\"_blank\">26394900<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+Generic+Data+Harmonization+Process+for+Cross-linked+Research+and+Network+Interaction.+Construction+and+Application+for+the+Lung+Cancer+Phenotype+Database+of+the+German+Center+for+Lung+Research&rft.jtitle=Methods+of+Information+in+Medicine&rft.aulast=Firnkorn%2C+D.%3B+Ganzinger%2C+M.%3B+Muley%2C+T.+et+al.&rft.au=Firnkorn%2C+D.%3B+Ganzinger%2C+M.%3B+Muley%2C+T.+et+al.&rft.date=2015&rft.volume=54&rft.issue=5&rft.pages=455-60&rft_id=info:doi\/10.3414%2FME14-02-0030&rft_id=info:pmid\/26394900&rfr_id=info:sid\/en.wikipedia.org:Journal:Information_management_for_enabling_systems_medicine\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-VassiliadisASurv09-6\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-VassiliadisASurv09_6-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Vassiliadis, P. (2009). \"A Survey of Extract\u2013Transform\u2013Load Technology\". <i>International Journal of Data Warehousing and Mining<\/i> <b>5<\/b> (3): 27. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4018%2Fjdwm.2009070101\" target=\"_blank\">10.4018\/jdwm.2009070101<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+Survey+of+Extract%E2%80%93Transform%E2%80%93Load+Technology&rft.jtitle=International+Journal+of+Data+Warehousing+and+Mining&rft.aulast=Vassiliadis%2C+P.&rft.au=Vassiliadis%2C+P.&rft.date=2009&rft.volume=5&rft.issue=3&rft.pages=27&rft_id=info:doi\/10.4018%2Fjdwm.2009070101&rfr_id=info:sid\/en.wikipedia.org:Journal:Information_management_for_enabling_systems_medicine\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BrownPatient16-7\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BrownPatient16_7-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Brown, S.A. (2016). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5121278\" target=\"_blank\">\"Patient Similarity: Emerging Concepts in Systems and Precision Medicine\"<\/a>. <i>Frontiers in Physiology<\/i> <b>7<\/b>: 561. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3389%2Ffphys.2016.00561\" target=\"_blank\">10.3389\/fphys.2016.00561<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5121278\/\" target=\"_blank\">PMC5121278<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/27932992\" target=\"_blank\">27932992<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5121278\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5121278<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Patient+Similarity%3A+Emerging+Concepts+in+Systems+and+Precision+Medicine&rft.jtitle=Frontiers+in+Physiology&rft.aulast=Brown%2C+S.A.&rft.au=Brown%2C+S.A.&rft.date=2016&rft.volume=7&rft.pages=561&rft_id=info:doi\/10.3389%2Ffphys.2016.00561&rft_id=info:pmc\/PMC5121278&rft_id=info:pmid\/27932992&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5121278&rfr_id=info:sid\/en.wikipedia.org:Journal:Information_management_for_enabling_systems_medicine\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AamodtCase94-8\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AamodtCase94_8-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Aamodt, A.; Plaza, E. (1994). \"Case-based reasoning: foundational issues, methodological variations, and system approaches\". 
<i>AI Communications<\/i> <b>7<\/b> (1): 39\u201359.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Case-based+reasoning%3A+foundational+issues%2C+methodological+variations%2C+and+system+approaches&rft.jtitle=AI+Communications&rft.aulast=Aamodt%2C+A.%3B+Plaza%2C+E.&rft.au=Aamodt%2C+A.%3B+Plaza%2C+E.&rft.date=1994&rft.volume=7&rft.issue=1&rft.pages=39%E2%80%9359&rfr_id=info:sid\/en.wikipedia.org:Journal:Information_management_for_enabling_systems_medicine\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MotikHyper09-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MotikHyper09_9-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Motik, B.; Shearer, R.; Horrocks, I. (2009). \"Hypertableau Reasoning for Description Logics\". <i>Journal Of Artificial Intelligence Research<\/i> <b>36<\/b>: 165\u2013228. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1613%2Fjair.2811\" target=\"_blank\">10.1613\/jair.2811<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Hypertableau+Reasoning+for+Description+Logics&rft.jtitle=Journal+Of+Artificial+Intelligence+Research&rft.aulast=Motik%2C+B.%3B+Shearer%2C+R.%3B+Horrocks%2C+I.&rft.au=Motik%2C+B.%3B+Shearer%2C+R.%3B+Horrocks%2C+I.&rft.date=2009&rft.volume=36&rft.pages=165%E2%80%93228&rft_id=info:doi\/10.1613%2Fjair.2811&rfr_id=info:sid\/en.wikipedia.org:Journal:Information_management_for_enabling_systems_medicine\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SinghalText16-10\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SinghalText16_10-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Singhal, A.; Simmons, M.; Lu, Z. (2016). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5130168\" target=\"_blank\">\"Text Mining Genotype-Phenotype Relationships from Biomedical Literature for Database Curation and Precision Medicine\"<\/a>. <i>PLoS Computational Biology<\/i> <b>12<\/b> (11): e1005017. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pcbi.1005017\" target=\"_blank\">10.1371\/journal.pcbi.1005017<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5130168\/\" target=\"_blank\">PMC5130168<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/27902695\" target=\"_blank\">27902695<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5130168\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5130168<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Text+Mining+Genotype-Phenotype+Relationships+from+Biomedical+Literature+for+Database+Curation+and+Precision+Medicine&rft.jtitle=PLoS+Computational+Biology&rft.aulast=Singhal%2C+A.%3B+Simmons%2C+M.%3B+Lu%2C+Z.&rft.au=Singhal%2C+A.%3B+Simmons%2C+M.%3B+Lu%2C+Z.&rft.date=2016&rft.volume=12&rft.issue=11&rft.pages=e1005017&rft_id=info:doi\/10.1371%2Fjournal.pcbi.1005017&rft_id=info:pmc\/PMC5130168&rft_id=info:pmid\/27902695&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5130168&rfr_id=info:sid\/en.wikipedia.org:Journal:Information_management_for_enabling_systems_medicine\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GeneTheGene08-11\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GeneTheGene08_11-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Gene Ontology Consortium (2008). 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2238979\" target=\"_blank\">\"The Gene Ontology project in 2008\"<\/a>. <i>Nucleic Acids Research<\/i> <b>36<\/b> (DB1): D440-4. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fnar%2Fgkm883\" target=\"_blank\">10.1093\/nar\/gkm883<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC2238979\/\" target=\"_blank\">PMC2238979<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/17984083\" target=\"_blank\">17984083<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2238979\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2238979<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+Gene+Ontology+project+in+2008&rft.jtitle=Nucleic+Acids+Research&rft.aulast=Gene+Ontology+Consortium&rft.au=Gene+Ontology+Consortium&rft.date=2008&rft.volume=36&rft.issue=DB1&rft.pages=D440-4&rft_id=info:doi\/10.1093%2Fnar%2Fgkm883&rft_id=info:pmc\/PMC2238979&rft_id=info:pmid\/17984083&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC2238979&rfr_id=info:sid\/en.wikipedia.org:Journal:Information_management_for_enabling_systems_medicine\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GhoshSoftware11-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GhoshSoftware11_12-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Ghosh, S.; Matsuoka, Y.; Asai, Y. et al. (2011). \"Software for systems biology: From tools to integrated platforms\". <i>Nature Reviews Genetics<\/i> <b>12<\/b> (12): 821-32. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fnrg3096\" target=\"_blank\">10.1038\/nrg3096<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/22048662\" target=\"_blank\">22048662<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Software+for+systems+biology%3A+From+tools+to+integrated+platforms&rft.jtitle=Nature+Reviews+Genetics&rft.aulast=Ghosh%2C+S.%3B+Matsuoka%2C+Y.%3B+Asai%2C+Y.+et+al.&rft.au=Ghosh%2C+S.%3B+Matsuoka%2C+Y.%3B+Asai%2C+Y.+et+al.&rft.date=2011&rft.volume=12&rft.issue=12&rft.pages=821-32&rft_id=info:doi\/10.1038%2Fnrg3096&rft_id=info:pmid\/22048662&rfr_id=info:sid\/en.wikipedia.org:Journal:Information_management_for_enabling_systems_medicine\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HuckaTheSys15-13\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HuckaTheSys15_13-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Hucka, M.; Bergmann, F.T.; Hoops, S. et al. (2015). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5451324\" target=\"_blank\">\"The Systems Biology Markup Language (SBML): Language Specification for Level 3 Version 1 Core\"<\/a>. <i>Journal of Integrative Bioinformatics<\/i> <b>12<\/b> (2): 266. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.2390%2Fbiecoll-jib-2015-266\" target=\"_blank\">10.2390\/biecoll-jib-2015-266<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5451324\/\" target=\"_blank\">PMC5451324<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26528564\" target=\"_blank\">26528564<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5451324\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5451324<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+Systems+Biology+Markup+Language+%28SBML%29%3A+Language+Specification+for+Level+3+Version+1+Core&rft.jtitle=Journal+of+Integrative+Bioinformatics&rft.aulast=Hucka%2C+M.%3B+Bergmann%2C+F.T.%3B+Hoops%2C+S.+et+al.&rft.au=Hucka%2C+M.%3B+Bergmann%2C+F.T.%3B+Hoops%2C+S.+et+al.&rft.date=2015&rft.volume=12&rft.issue=2&rft.pages=266&rft_id=info:doi\/10.2390%2Fbiecoll-jib-2015-266&rft_id=info:pmc\/PMC5451324&rft_id=info:pmid\/26528564&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5451324&rfr_id=info:sid\/en.wikipedia.org:Journal:Information_management_for_enabling_systems_medicine\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GanslandtUnlocking11-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GanslandtUnlocking11_14-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Ganslandt, T.; Mate, S.; Helbing, K. et al. (2011). 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3631913\" target=\"_blank\">\"Unlocking Data for Clinical Research - The German i2b2 Experience\"<\/a>. <i>Applied Clinical Informatics<\/i> <b>2<\/b> (1): 116\u201327. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4338%2FACI-2010-09-CR-0051\" target=\"_blank\">10.4338\/ACI-2010-09-CR-0051<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3631913\/\" target=\"_blank\">PMC3631913<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23616864\" target=\"_blank\">23616864<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3631913\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3631913<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Unlocking+Data+for+Clinical+Research+-+The+German+i2b2+Experience&rft.jtitle=Applied+Clinical+Informatics&rft.aulast=Ganslandt%2C+T.%3B+Mate%2C+S.%3B+Helbing%2C+K.+et+al.&rft.au=Ganslandt%2C+T.%3B+Mate%2C+S.%3B+Helbing%2C+K.+et+al.&rft.date=2011&rft.volume=2&rft.issue=1&rft.pages=116%E2%80%9327&rft_id=info:doi\/10.4338%2FACI-2010-09-CR-0051&rft_id=info:pmc\/PMC3631913&rft_id=info:pmid\/23616864&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3631913&rfr_id=info:sid\/en.wikipedia.org:Journal:Information_management_for_enabling_systems_medicine\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-FirnkornUnlock16-15\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-FirnkornUnlock16_15-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Firnkorn, D.; Merker, S.; Ganzinger, M. et al. (2016). \"Unlocking Data for Statistical Analyses and Data Mining: Generic Case Extraction of Clinical Items from i2b2 and tranSMART\". <i>Studies in Health Technology and Informatics<\/i> <b>228<\/b>: 567\u201371. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/27577447\" target=\"_blank\">27577447<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Unlocking+Data+for+Statistical+Analyses+and+Data+Mining%3A+Generic+Case+Extraction+of+Clinical+Items+from+i2b2+and+tranSMART&rft.jtitle=Studies+in+Health+Technology+and+Informatics&rft.aulast=Firnkorn%2C+D.%3B+Merker%2C+S.%3B+Ganzinger%2C+M.+et+al.&rft.au=Firnkorn%2C+D.%3B+Merker%2C+S.%3B+Ganzinger%2C+M.+et+al.&rft.date=2016&rft.volume=228&rft.pages=567%E2%80%9371&rft_id=info:pmid\/27577447&rfr_id=info:sid\/en.wikipedia.org:Journal:Information_management_for_enabling_systems_medicine\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BachKnowledge14-16\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BachKnowledge14_16-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Bach, K.; Sauer, C.; Althoff, K.-D.; Roth-Berghofer, T. (2014). <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/sds.dfki.de\/publication\/knowledge-modeling-open-source-tool-mycbr\" target=\"_blank\">\"Knowledge Modeling with the Open Source Tool myCBR\"<\/a>. <i>Proceedings of the 10th Workshop on Knowledge Engineering and Software Engineering<\/i> <b>2014<\/b><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/sds.dfki.de\/publication\/knowledge-modeling-open-source-tool-mycbr\" target=\"_blank\">https:\/\/sds.dfki.de\/publication\/knowledge-modeling-open-source-tool-mycbr<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Knowledge+Modeling+with+the+Open+Source+Tool+myCBR&rft.jtitle=Proceedings+of+the+10th+Workshop+on+Knowledge+Engineering+and+Software+Engineering&rft.aulast=Bach%2C+K.%3B+Sauer%2C+C.%3B+Althoff%2C+K.-D.%3B+Roth-Berghofer%2C+T.&rft.au=Bach%2C+K.%3B+Sauer%2C+C.%3B+Althoff%2C+K.-D.%3B+Roth-Berghofer%2C+T.&rft.date=2014&rft.volume=2014&rft_id=https%3A%2F%2Fsds.dfki.de%2Fpublication%2Fknowledge-modeling-open-source-tool-mycbr&rfr_id=info:sid\/en.wikipedia.org:Journal:Information_management_for_enabling_systems_medicine\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to grammar, spelling, and presentation, including the addition of PMCID and DOI when they were missing from the original reference.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Information_management_for_enabling_systems_medicine\">https:\/\/www.limswiki.org\/index.php\/Journal:Information_management_for_enabling_systems_medicine<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","a68557faaf217ce0a165c006c2605bb8_images":["https:\/\/www.limswiki.org\/images\/9\/9a\/Fig1_Ganzinger_CurDirBioEng2017_3-2.png"],"a68557faaf217ce0a165c006c2605bb8_timestamp":1544813856,"428fb6eb50c74d741daa88c4061eeab2_type":"article","428fb6eb50c74d741daa88c4061eeab2_title":"Evidence-based design and evaluation of a whole genome sequencing clinical report for the reference microbiology laboratory (Crisan et al. 
2017)","428fb6eb50c74d741daa88c4061eeab2_url":"https:\/\/www.limswiki.org\/index.php\/Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory","428fb6eb50c74d741daa88c4061eeab2_plaintext":"Journal:Evidence-based design and evaluation of a whole genome sequencing clinical report for the reference microbiology laboratory\nFrom LIMSWiki\n\nFull article title: Evidence-based design and evaluation of a whole genome sequencing clinical report for the reference microbiology laboratory\nJournal: PeerJ\nAuthor(s): Crisan, Anamaria; McKee, Geoffrey; Munzner, Tamara; Gardy, Jennifer L.\nAuthor affiliation(s): University of British Columbia, British Columbia Centre for Disease Control\nPrimary contact: Email: jennifer dot gardy at bccdc dot ca\nEditors: Smidt, H.\nYear published: 2018\nVolume and issue: 6\nPage(s): e4218\nDOI: 10.7717\/peerj.4218\nISSN: 2167-8359\nDistribution license: Creative Commons Attribution 4.0 International\nWebsite: https:\/\/peerj.com\/articles\/4218\/\nDownload: https:\/\/peerj.com\/articles\/4218.pdf (PDF)
\n\n\nContents\n\n1 Abstract \n2 Introduction \n\n2.1 Human-centered design in the clinical laboratory \n2.2 Collaboration context\u2014COMPASS-TB \n\n\n3 Materials and methods \n\n3.1 Overview of design study methodology \n3.2 Discovery stage \n3.3 Design stage \n3.4 Implementation stage \n\n\n4 Results \n\n4.1 Experts emphasized prioritizing information and revealed constraints \n4.2 Experts vary in their perception of different data types \n4.3 WGS data is vital, but some lack confidence in its interpretation \n4.4 Respondent consensus suggests a role for WGS in diagnosis and treatment tasks \n4.5 Prototyping via a design sprint produces a range of design alternatives \n4.6 The design choice questionnaire quantifies participant preferences for specific design elements \n4.7 Qualitative data affords additional insights into report design \n4.8 Developing a final report template \n\n\n5 Discussion \n6 Conclusions \n7 Supplemental information \n8 Additional information and declarations \n\n8.1 Competing interests \n8.2 Author contributions \n8.3 Human ethics \n8.4 Data availability \n8.5 Funding \n\n\n9 Acknowledgements \n10 References \n11 Notes \n\n\n\nAbstract \nBackground: Microbial genome sequencing is now being routinely used in many clinical and public health laboratories. Understanding how to report complex genomic test results to stakeholders who may have varying familiarity with genomics (including clinicians, laboratorians, epidemiologists, and researchers) is critical to the successful and sustainable implementation of this new technology; however, there are no evidence-based guidelines for designing such a report in the pathogen genomics domain. 
Here, we describe an iterative, human-centered approach to creating a report template for communicating tuberculosis (TB) genomic test results.\nMethods: We used design study methodology, a human-centered approach drawn from the information visualization domain, to redesign an existing clinical report. We used expert consults and an online questionnaire to discover various stakeholders\u2019 needs around the types of data and tasks related to TB that they encounter in their daily workflow. We also evaluated their perceptions of and familiarity with genomic data, as well as its utility at various clinical decision points. These data shaped the design of multiple prototype reports that were compared against the existing report through a second online survey, with the resulting qualitative and quantitative data informing the final, redesigned report.\nResults: We recruited 78 participants, 65 of whom were clinicians, nurses, laboratorians, researchers, and epidemiologists involved in TB diagnosis, treatment, and\/or surveillance. Our first survey indicated that participants were largely enthusiastic about genomic data, with the majority agreeing on its utility for certain TB diagnosis and treatment tasks and many reporting some confidence in their ability to interpret this type of data (between 58.8% and 94.1%, depending on the specific data type). When we compared our four prototype reports against the existing design, we found that for the majority (86.7%) of design comparisons, participants preferred the alternative prototype designs over the existing version, and that both clinicians and non-clinicians expressed similar design preferences. Participants showed clearer design preferences when asked to compare individual design elements versus entire reports. 
Both the quantitative and qualitative data informed the design of a revised report, available online as a LaTeX template.\nConclusions: We show how a human-centered design approach integrating quantitative and qualitative feedback can be used to design an alternative report for representing complex microbial genomic data. We suggest experimental and design guidelines to inform future design studies in the bioinformatics and microbial genomics domains. We also suggest that this type of mixed-methods study is important to facilitate the successful translation of pathogen genomics in the clinic, not only for clinical reports but also more complex bioinformatics data visualization software.\nKeywords: human-centered design, next-generation sequencing, report, tuberculosis, genome\n\nIntroduction \nWhole genome sequencing (WGS) is quickly moving from proof-of-concept research into routine clinical and public health use. WGS can diagnose infections at least as accurately as current protocols[1][2], can predict antimicrobial resistance phenotypes for certain drugs[3][4][5] with high concordance to culture-based testing methods, and can be used in outbreak surveillance to resolve transmission clusters at a resolution not possible with existing genomic or epidemiological methods.[6] Importantly, WGS offers faster turnaround times compared to many culture-based tests, particularly for antimicrobial resistance testing in slow-growing bacteria.\nAs reference microbiology laboratories move towards accreditation of WGS for routine clinical use, the community is turning its attention toward standardization, developing standard operating procedures for reproducible sample handling, sequencing, and downstream bioinformatics analysis.[7][8] Reporting genomic microbiology test results in a way that is interpretable by clinicians, nurses, laboratory staff, researchers, and surveillance experts and that meets regulatory requirements is equally important; however, relatively little 
effort has been directed toward this area. WGS clinical reports are often produced in-house on an ad hoc, project-by-project basis, with the resulting product not necessarily meeting the needs of the many stakeholders using the report in their clinical and surveillance workflows.\n\nHuman-centered design in the clinical laboratory \nThe information visualization, human\u2013computer interaction, and usability engineering fields offer techniques and design guidelines that have informed bioinformatics tools, including Disease View[9] for exploring host-pathogen interaction data and Microreact[10] for visualizing phylogenetic trees in the context of epidemiological or clinical data. Although the public health community is beginning to recognize the potential role of visualization and analytics in daily laboratory workflows[11] these techniques have not yet been applied to routine reporting of microbiological test results. However, work from the human health domain \u2014 particularly the formatting and display of pathology reports, where standardization is critical[12] \u2014 sheds light on the complex task of clinical report design.\nValenstein reports four principles for organizing an effective pathology report: use headlines to emphasize key points, ensure design continuity over time and relative to other reports, consider information density, and reduce clutter[13], while Renshaw et al.[14] note that when pathology report templates were reformatted with numbering and bolding to highlight required information, template completion rates rose from 84 to 98%. 
Fixed, consistent layout of medical record elements, highlighting of data relative to background text, and single-page layout improve clinicians\u2019 ability to locate information[15], while information design principles, including visually structuring the document to separate different elements and organizing information to meet the needs of multiple stakeholder types, can reduce the number of errors in data interpretation.[16]\nWork in the electronic health record (EHR) and patient risk communication domains has also provided insight into not just the final product but also the process of effective design. Through quantitative and qualitative evaluations, research has shown that some EHRs are difficult to use because they were not designed to support clinical tasks and information retrieval, but rather data entry.[16] Reviews of the risk communication literature note that while many visual aids improve patients\u2019 understanding of risk[17], the design features that viewers preferred \u2014 namely simplistic, minimalist designs \u2014 were not necessarily those that led to an accurate interpretation of the underlying data.[18] Together, these gaps indicate a need for a human-centered, participatory approach iteratively incorporating both design and evaluation.[19][20]\n\nCollaboration context\u2014COMPASS-TB \nThe COMPASS-TB project was a proof-of-concept study demonstrating the feasibility and utility of WGS for diagnosing tuberculosis (TB) infection, evaluating an isolate\u2019s antimicrobial sensitivity\/resistance and genotyping the isolate to identify epidemiologically related cases.[4] On the basis of COMPASS-TB\u2019s results, Public Health England (PHE) has implemented routine WGS in the TB reference laboratory[21]; however, this requires changing how mycobacteriology results are reported to clinical and public health stakeholders. 
The COMPASS-TB pilot used reports designed by the project team, but as clinical implementation within PHE progressed, team members expressed an interest in redesigning the report (Fig. 1) to facilitate interpretation of this new data type and align laboratory reporting practices with the needs of multiple TB stakeholders.\n\r\n\n\n\n\n\n\n\n\n\n\n Fig. 1 An earlier COMPASS-TB report design\n\n\n\nWe undertook a mixed-methods and iterative human-centered approach to inform the design and evaluation of a clinical TB WGS report. Specifically, we chose to use design study methodology[22], an approach adopted from the information visualization discipline. When using a design study methodology approach, researchers examine a problem faced by a group of domain specialists, explore their available data and the tasks they perform in reference to that problem, create a product \u2014 in our case a report, but, in the more general case, a visualization system \u2014 to help solve the problem, assess the product with domain specialists, and reflect on the process to improve future design activities. Compared to an ad hoc approach to design, design study methodology engages domain specialists and grounds the design and evaluation of the visualization system in tasks \u2014 in this case TB diagnosis, treatment, and surveillance \u2014 as well as data. It is this marriage of data and tasks to design choices, informed by real needs and supported by empirical evidence, that results in a final product that is relevant, usable, and interpretable.\nHere we describe our application of design study methodology to the COMPASS-TB report redesign. Targeting clinical and public health stakeholders with at least some familiarity with public health genomics, we show how evidence-based design can be incorporated into the emerging field of clinical microbial genomics and present a final report template, which may be ported to other organisms. 
We also recommend a set of guidelines to support future applications of human-centered design in microbial genomics, whether for report designs or for more complex bioinformatics visualization software.\n\nMaterials and methods \nOverview of design study methodology \nThe design study methodology[22] is an iterative framework outlining an approach to human-centered visualization design and evaluation. It consists of three phases \u2014 precondition, core analysis, and reflection \u2014 that together comprise nine stages. The precondition and reflection phases focus on establishing collaborations and writing up research findings, respectively, and are not elaborated upon further here. We describe our work within each of the three stages of the core analysis phase: discovery, design, and implementation (Fig. 2). We define domain specialists in this case as the TB stakeholders \u2014 clinicians, laboratorians, and epidemiologists \u2014 who regularly use reports from the reference mycobacteriology laboratory in their work.\n\r\n\n\n\n\n\n\n\n\n\n\n Fig. 2 Our human-centered design approach. The core analysis phase of the design study methodology consists of discovery, design, and implementation stages. Using this methodological backbone, we collected and analyzed data using mixed-methods study designs in the discovery and design stages, which informed the final TB WGS clinical report design.\n\n\n\nOur research was reviewed and approved by the University of British Columbia\u2019s Behavioural Research Ethics Board (H10-03336). All data were collected through secure means approved by the university and were de-identified for analysis and sharing. Anonymized quantitative results from each of the surveys and the analysis code are available at https:\/\/github.com\/amcrisan\/TBReportRedesign and in the Supplemental Information 1. 
We also provide the full text of our survey instruments in the Supplemental Information 1.\n\nDiscovery stage \nIn the discovery stage, we first gathered qualitative data through expert consults to identify the data types used in TB diagnosis, treatment, and surveillance tasks. We then gathered quantitative data through an online survey to more robustly link particular data types to specific tasks. This staged approach to data gathering is known as the exploratory sequential model.[23]\nOur expert consults took the form of semi-structured interviews with seven individuals recruited from the COMPASS-TB project team, the British Columbia Centre for Disease Control (BCCDC), and the British Columbia Public Health Laboratory (BCPHL). The interview questions served as prompts to structure the conversation, but experts were free to comment, at any depth, on the different aspects of TB diagnosis, treatment, and surveillance. We took notes during the consults in order to identify the tasks and data types common to TB workflows in the U.K. and Canada, as well as to determine which tasks could be supported by WGS data.\nInformed by the expert consults, we drafted a task and data questionnaire (text in Supplemental Information 1) to survey data types used across the TB workflow (see results for a list of data types), the role for WGS data in diagnosis, treatment, and surveillance tasks; and participants\u2019 confidence in interpreting different data types. The questionnaire primarily used multiple choice and true\/false type questions, but it also included the optional entry of freeform text. The questionnaire was deployed online using the FluidSurveys platform, and participants were recruited using snowball and convenience sampling for a one-week period in July 2016. 
For questions pertaining to diagnostic and treatment tasks, we gathered information only from participants self-identifying as clinicians; for the remaining sections of the survey, all participants were prompted to answer each question.\nOnly completed questionnaires were used for analysis. For questions pertaining to participants\u2019 background, their perception of WGS utility, and their confidence interpreting WGS data, we report primarily descriptive statistics. To link TB workflow tasks to specific data types, we presented participants with different task-based scenarios related to diagnosis, treatment, and surveillance and asked which data types they would use to complete the task. For each pair of data and task we assigned a consensus score depending on the proportion of participants who reported using a data type for a specific task: 0 for fewer than 25% of participants, 1 for 25\u201350%, 2 for 50\u201375%, and 3 if more than 75% of participants reported using a specific data type for the task at hand. Consensus scores for a data type were also summed across the different tasks. Freeform text, when it was provided, was considered only to add context to participant responses.\n\nDesign stage \nThe discovery stage revealed which data types to include in the redesigned report, while the goal of the design stage was to identify how it should be presented. We used a \"design sprint\" event to produce a series of prototype reports, which were then assessed through a second online questionnaire. 
This survey collected quantitative data on participants\u2019 preference for specific design elements, with participants also able to provide qualitative feedback on each element \u2014 a type of embedded mixed methods study design.[23]\nThe design sprint was an interactive design session involving members of the University of British Columbia\u2019s Information Visualization research group, in which teams created alternative designs to report WGS data for the diagnosis, treatment, and surveillance tasks. Teams developed paper prototypes[24][25] of a complete WGS TB report and, at the completion of the event, presented their prototypes and the rationale for each design choice. The paper prototypes were then digitally mocked up, both as complete reports and as individual elements (see the results in Figs. 3 and 4); these digital prototypes were standardized with respect to text, fonts, and sample data where appropriate and used as the basis of the second online survey.\n\r\n\n\n\n\n\n\n\n\n\n\n Fig. 3 Digital mockups of complete report prototypes generated during the design sprint. (A) Prototype report 1, (B) prototype report 2, (C) prototype report 3, (D) prototype report 4\n\n\n\n\n\n\n\n\n\n\n\n\n Fig. 4 Isolated design elements. The original report element, highlighted in red, is broken down into isolated design elements, each of which was tested independently in the report design survey. In this example, the original resistance summary yields five different alternative wordings and design elements.\n\n\n\nIn the design choice questionnaire (text in Supplemental Information 1), we evaluated participants\u2019 preferences for individual design elements, comparing the options generated during the design sprint as well as the initial COMPASS-TB report design, which we hereafter refer to as the control design. As with the first survey, the questionnaire used FluidSurveys, with participants recruited using snowball and convenience sampling. 
Individuals who had previously participated in the data and task questionnaire were also invited to participate. The survey was open for one month beginning September 10, 2016 and was reopened to recruit additional participants for one month beginning January 5, 2017, as part of the registration for a TB WGS conference hosted by PHE. Only completed surveys were analyzed.\nWe used single-selection multiple-choice, Likert scale, and ranking questions to assess participant preferences. For multiple-choice and Likert scale questions, we calculated the number of participants that selected each option and report the sum. For questions that required participants to rank options we calculated a rescaled rank score as follows:\n\n$$\\text{rescaled rank}(D_{i}) = 1 - \\frac{P^{-1}\\sum_{p=1}^{P} R_{p} - 1}{N - 1}$$\n\nwhere D_i is a design choice with i = {1\u2026N}, N is the total number of design choices, R_p = {1\u2026N} is the raw rank that participant p assigned to D_i, p = {1\u2026P} indexes participants, and P is the total number of participants. In our study, 1 was the highest rank (most preferred) and N was the lowest rank (least preferred) option. As an example, if a design, D1, is always ranked 1 (greatest preference by everyone), the sum of those ranks is P, resulting in a numerator of 0 and a rescaled rank score of 1. Alternatively, if a design, D2, is always ranked last (N), the sum of those ranks will be P\u2217N, resulting in a rescaled rank score of 0. Thus, the rescaled rank score ranges from 1 (consistently ranked as first) to 0 (consistently ranked last). 
This transformation from raw to rescaled ranks allows us to compare across questions with different numbers of options, but is predicated on each design alternative having a rank, which is why this approach was not extended to multiple choice questions.\nTo contextualize rescaled rank scores, we randomly permuted participants\u2019 scores 1,000 times and pooled the rescaled rank scores across these iterations to obtain an average score (intuitively and empirically this is 0.5 for the rank questions and 1\/N for multiple choice questions) and standard deviation. For each design choice, we plotted its actual rescaled rank score against the distribution of random permutations, highlighting whether the score was within \u00b1 1, 2, or 3 standard deviations from the random permutation mean score. The closer a score was to the mean, the more probable that the participants\u2019 preferences were no better than random. We also calculated bootstrapped 95% confidence intervals for both rank and multiple choice type questions by re-sampling participants, with replacement, over 1,000 iterations.\n\nImplementation stage \nBy combining the results of the design choice questionnaire with medical test reporting requirements from the ISO 15189:2012 standards, we developed a final template for reporting TB WGS data in the clinical laboratory. We used deviation from a random score, described in the methods, as an indicator of preference, with strong preferences being three or more standard deviations from the random score. When there was no strongly preferred element, we explain our design choice in the design walkthrough (Supplemental Information 1). We also considered consensus between clinicians and non-clinicians and defaulted to clinician preferences in instances of disagreement as they are the primary consumers of this report. 
The final prototype is implemented in LaTeX and is available online as a template accessible at http:\/\/www.cs.ubc.ca\/labs\/imager\/tr\/2017\/MicroReportDesign\/.\n\nResults \nExpert consults, the task and data questionnaire, and the design choice questionnaires recruited a total of 78 participants across different roles in TB management and control (Table 1).\n\nTable 1. Total study participants across different stages of the design study methodology. Notes: *National Reference Laboratory Service\n\n | Expert consults | Task and data questionnaire | Design choice questionnaire\nStage | Discovery | Discovery | Design\nData collected | Qualitative | Quantitative | Qualitative and Quantitative\nParticipants | N (% survey total) | N (% survey total) | N (% survey total)\nClinician | 2 (29%) | 7 (40%) | 13 (25%)\nNurse | 1 (14%) | 3 (18%) | 5 (9%)\nLaboratory | 2 (29%) | 3 (18%) | 8 (15%)\nResearch | 0 (0%) | 1 (6%) | 8 (15%)\nSurveillance | 1 (14%) | 3 (18%) | 8 (15%)\nOther* | 1 (14%) | 0 (0%) | 12 (21%)\nTOTAL | 7 (100%) | 17 (100%) | 54 (100%)\n\nExperts emphasized prioritizing information and revealed constraints \nThe objective of our expert consults was to understand how reports from the reference mycobacteriology laboratory are currently used in the day-to-day workflows of various TB stakeholders (including clinicians, laboratorians, epidemiologists, and researchers) and what data types are currently used to inform those tasks. 
Tasks and data types enumerated in the interviews were used to populate downstream quantitative questionnaires; however, the interviews also provided insights into how stakeholders viewed the role of genomics in a clinical laboratory.\nAmongst the procedural insights, stakeholders frequently reported that the biggest benefit of WGS over standard mycobacteriology laboratory protocols was to improve testing turnaround times and gather all test results into a single document, rather than having multiple lab reports arriving over weeks to months. Several experts emphasized that these benefits can only be realized if the WGS analytical pipeline has been clinically validated. Although our study team included a clinician and a TB researcher, two surprising procedural insights emerged from the consultations. First, multiple experts from a clinical background emphasized that this audience has extremely limited time to digest the information found on a clinical report. In describing their interaction with a laboratory report, one participant noted that \u201c10 seconds [to review content] is likely, one minute is luxurious\u201d while others described variations on the theme of wanting bottom-line, actionable information as quickly as possible. This insight profoundly shaped downstream decisions around how much data to include on a redesigned report and how to arrange it over the report to permit both a quick glance and a deeper dive. Second, experts indicated that laboratory reports were delivered using a variety of formats, including PDFs appended to electronic health records, faxes, or physical mail. 
This created design constraints at the outset of the project; our redesigned report needed to be legible no matter the medium, ruling out online interactivity, and needed to be black and white.\n\nExperts vary in their perception of different data types \nAt the data level, we observed that clinicians and non-clinicians differed in their perceptions of data types and in the level of detail they desired, perhaps reflecting the clinicians\u2019 procedural need for rapid interpretation. Clinicians emphasized the importance of presenting actionable results clearly and omitting those that were not clinically relevant for them. For example, when presented with the sequence quality data on the current COMPASS-TB report (Fig. 1) \u2014 metrics reflecting the quality of the sequencing run and downstream bioinformatics analysis \u2014 interviewees did not expect the lab to release poor quality data, given the presence of strict quality control mechanisms. ISO 15189:2012 standards require some degree of reporting around the measurement procedure and results, but this insight suggested such data might best be placed later in the report in a simplified format, or described in the report comments. Similarly, experts were also divided on the interpretability and utility of the phylogenetic tree in the epidemiological relatedness section of the current COMPASS-TB report, with clinicians noting that the case belonging to an epidemiological cluster would not impact their use of the genomic test results.\nExperts also disagreed about the level of detail needed for WGS data, and this appeared to depend on whether the expert was a clinician as well as their prior experience with WGS through the COMPASS-TB project. For example, one expert indicated that \u201cclinicians are wanting to know which mutations conferred resistance\u201d, while another noted that they \u201cdon\u2019t use these [mutations] right now routinely, so it\u2019s not that relevant\u201d. 
When asked to comment on the resistance summary table in the current COMPASS-TB report (Fig. 1), clinicians were concerned about the use of abbreviations for both drug names and susceptibility status leading to misinterpretation, and many were uncertain how to use the detailed mutation information in the resistotype table.\n\nWGS data is vital, but some lack confidence in its interpretation \nThe expert consults provided a detailed overview of the tasks and data associated with TB care, allowing us to create a draft workflow outlining the TB diagnosis, treatment, and surveillance tasks coupled to the supporting data sources and data types (Fig. S1). This workflow was used to design the task and data questionnaire.\nOf the 17 participants responding in full to the task and data questionnaire (Table 1), most were from the United Kingdom (88%) and most reported professional experience and formal education in infectious diseases and epidemiology (Table S1). Participants were less likely to report education at the masters or doctoral level in microbial genomics, biochemistry, or bioinformatics (Table S1). Fewer than half (47.1%) of participants had participated in TB WGS projects, but all (100%) participants were enthusiastic about the role of microbial genomics in infectious disease diagnosis, both today (47.1%) and in the near future, pending clinical validation (52.9%).\nWhen queried about their potential future use of molecular data \u2014 whether WGS, genotyping, or other \u2014 participants indicated they foresaw themselves consulting, often or all the time, data on resistance-conferring mutations (82.3% of participants), MIRU-VNTR patterns (88.2%), epidemiological cluster membership (76.5%), single nucleotide polymorphism\/variant distances from other isolates (64.7%), and WGS quality metrics (58.8%) (Table S2). 
However, of the 14 different data types queried, the majority of participants only felt confident in interpreting four (MIRU-VNTR, drug susceptibility from culture, drug susceptibility from PCR or LPA, genomic clusters); most participants only felt somewhat confident, or not confident at all, interpreting the other data types (Table S3).\nMoving from confidence in their own interpretation of laboratory data types to confidence in the utility of WGS data in general, the majority of participants were confident that information contained within the TB genome can be used to correctly perform organism speciation (76.5%), assign a patient to existing clusters (70.0%), rule out transmission events (64.7%), and to a lesser extent were confident TB WGS could be used to identify epidemiologically-related patients (58.8%) and predict drug susceptibility (52.9%) (Table S4). The majority of participants thought genomic data may be able to inform clinicians of appropriate treatment regimens (100%) and identify transmission events (94.1%); however, participants showed mixed consensus toward whether genomic data could be used to monitor treatment progress for TB (47.2%) or diagnose active TB (52.9%).\n\nRespondent consensus suggests a role for WGS in diagnosis and treatment tasks \nTo examine which data types were being used to support diagnosis, treatment, and surveillance tasks in the workflow, we assigned a numerical score reflecting respondent consensus around each data type-task pair (Fig. 5). We found greater consensus around the data types that participants would use in diagnosis and treatment tasks, but little consensus around the data they would use for surveillance tasks, contrasting with participants\u2019 previously stated support for using WGS or other genotyping data for understanding TB epidemiology. 
Overall, the most frequently used data types included administrative data (patient ID, sample type, collection site, collection date) and results from current laboratory tests (solid or liquid culture, smear status, and speciation), which together were used primarily for diagnosis and treatment. Prior test results from a patient were deemed important; however, the earlier expert consults indicated that such data was difficult to obtain and unlikely to be included in future reports.\n\nFig. 5 Extent of consensus between TB workflow tasks and available TB data. Results are redundantly encoded using color and a numerical value to represent the degree of consensus between participants around using a specific data type to carry out a specific task.\n\nWe also queried participants\u2019 perceptions of barriers impacting their workflow, with the majority of participants (83.3%) reporting issues with both the timeliness of receiving TB data from the reference laboratory and the distribution of test results across multiple documents (Table S5), a finding that corroborated the procedural insights from the expert consults.\n\nPrototyping via a design sprint produces a range of design alternatives \nEquipped with an understanding of how WGS data might be used in the various TB workflow tasks, we embarked on the design stage of the design study methodology. A design sprint event involving study team members and information visualization experts resulted in four prototype report designs (Fig. 3) and various isolated design elements (Fig. 4). Although each prototype used different design elements for the required data types, when the prototypes were compared at the end of the event, common themes emerged. 
These included presenting data in an order informed by the workflow (data related to diagnosis, treatment, then surveillance); placing actionable, high-level data on the front page, with additional details on the second page; and using both an overall summary statement at the beginning of the report as well as brief summary statements at the beginning of each section.\nTo drill down and determine which design elements best communicate the underlying data, we isolated individual design elements (Fig. 4) and classified them as either wording choices (e.g., which heading to use for a given section of the report) or design choices, such as layout, the use of emphasis, and the use of graphics (Table S6).\n\nThe design choice questionnaire quantifies participant preferences for specific design elements \nWe next developed an online survey, the design choice questionnaire, to assess stakeholders\u2019 preferences for both specific design elements and overall report prototypes. The distribution of public health roles among survey participants is presented in Table 1; all but 11 participants (20%) actively worked with TB data. Participants were employed by academic institutions (35.2%), hospitals (24.1%), and public health organizations (33.3%), with only 7.4% of participants being employed in some other sector. The majority of participants were from the U.K. (59.2%), while 11.1% were from Canada; the remaining 29.7% were drawn from the United States (6.5%), Europe (14.8%), Brazil (2.8%), India (2.8%), and Gambia (2.8%).\nWe first examined participants\u2019 preference for specific wording and design elements (Figs. 6A and 6B), comparing elements arising from the prototypes to those used in the existing COMPASS-TB report, which acted as a control. Notably, of the 15 wording and design elements queried, in only two cases was the control design preferred over a design arising from one of the prototypes (note that one query did not compare to a control). 
Furthermore, in eight out of 15 queries (Q6, Q8, Q9, Q10, Q12, Q17, Q5, Q18) participants showed strong preferences, wherein the top preference was +3 or more standard deviations from the mean for both clinicians and non-clinicians. Figure S1 provides a version of Fig. 6 with confidence intervals and indicates concordance between strong preferences and non-overlapping confidence intervals.\n\nFig. 6 Design Choice Questionnaire results. Responses are grouped according to question type: wording (A), design choices (B), and full reports (C), and partitioned into clinician participants (squares) and non-clinician participants (circles). Responses are colored according to whether they are the control design from the original report (white) or an alternative design devised in the design sprint (black). Lines connect options between clinician and non-clinician preferences, with thicker crossing lines showing discordance between the two groups and vertical lines showing concordance in preferences. Rescaled rank scores are shown against a reference of random permutations (see Methods), with scores closer to 1 indicating the most preferred response. Specific questions are indicated with Q; the questions as presented to the participants are shown in Table S6.\n\nThe findings from the analysis of wording elements (Fig. 6A) showed that participants preferred complete terms to abbreviations, such as writing out \u201cisoniazid\u201d as opposed to \u201cINH\u201d or \u201cH\u201d, or \u201cresistant\u201d as opposed to \u201cR\u201d, and that both clinicians and non-clinicians were in agreement over the preferred vocabulary for section headings. Interestingly, wording questions related to the treatment task yielded the widest range of rankings.\nClear preferences were also observed for information design elements, again largely concordant between clinicians and non-clinicians (Fig. 6B). 
Participants preferred elements that drew attention to specific data such as summary statements, shading, and tick boxes, and many participants preferred that sections be prioritized, with less important details relegated to the second page of the report. However, there was less consensus around how much detail to include and where. The majority of participants indicated that genomic data pertaining to resistance-conferring mutations should be included (Fig. 6B; Q11), but were divided as to which data should be included and where. Most (85%) wanted to know the gene harboring the resistance mutation (e.g., katG, inhA), but only half wanted details of the specific mutation (50% wanted the amino acid substitution, 46% wanted to know the nucleotide-level change). We did not test any design elements displaying the strength of the association between the mutation and the resistance phenotype; however, we will add this to a future version of the report pending receipt of the final mutation catalog from the ReSeqTB Consortium.\nInterestingly, while both clinicians and non-clinicians reported similar rankings for most design elements, one element showed an unusual distribution of scores: the visualization for showing genomic relatedness and membership in a cluster. While both groups of participants preferred a phylogenetic tree accompanied by a summary table, which is the current COMPASS-TB control design, the other four options appeared to be ranked randomly, with rescaled rank scores close to 0.5, suggesting that none of the alternative options were particularly good.\nWe also had participants rank their preferences for the four prototype designs (Fig. 6C). While all participants ranked Prototype D as their least preferred choice, many citing that the images used were too distracting, clinicians and non-clinicians varied in their ranking of the other three options, with clinicians preferring option A and non-clinicians preferring B. 
However, qualitative feedback collected for this question revealed that participants found comparing individual elements easier than comparing full reports.\n\nQualitative data affords additional insights into report design \nThe qualitative responses in the design choice questionnaire raised important points that would otherwise not have been captured by quantitative data alone. For example, the importance of presenting drug susceptibility data clearly emerged from the qualitative responses. Participants indicated that \u201cthe report must call attention [to] drug resistance\u201d and expressed concern that the abbreviation of drug names and\/or predicted resistance phenotype could lead to misinterpretation and pose risks to patient safety, stating that \u201cnot all clinicians [are] likely to recognize the abbreviations\u201d and \u201c[using the full name] reduces the risk of errors, especially if new to TB\u201d. When choosing how to emphasize predicted drug susceptibility information (shading, bolding, alert glyphs, or no emphasis), some participants suggested \u201cshading draws the quickest attention to [resistance]\u201d and that \u201cwith presbyopia, resistance can be easily missed and therefore shading affords greater patient safety\u201d, but other participants indicated drug susceptibility, rather than resistance, should be emphasized: \u201cnot sure that resistant should be shaded\u2014better to shade sensitive drugs in my view\u201d and \u201cit would be better to highlight what is working instead of highlight what is not working.\u201d We opted to highlight resistance given the low incidence of drug-resistant TB in the U.K. and Canada, which were the primary application contexts. 
Some reported concerns as to whether such emphasis was possible with current electronic health records, including \u201c[bolding or shading] may not transfer correctly\u201d and \u201cshaded [text] won\u2019t photocopy well,\u201d which prompted us to test both printing and photocopying of the resulting report.\nThe issue of clinicians having little time to interact with the report, raised in both the expert consults and the task and data questionnaire, also became apparent in the qualitative responses to the design choice questionnaire, such as \u201cthe best likelihood of success will [come] from the ability to draw attention to someone scanning the document quickly.\u201d However, participants\u2019 perceptions of which design choices best promoted rapid synthesis varied. Some preferred summaries in the form of check boxes \u2014 \u201c[a] tick box is the most straightforward way to summarize it. Reading a summary sentence will probably take longer\u201d and \u201cthe check boxes provide an at-a-glance result\u201d \u2014 while others preferred additional commentary: \u201cinterpretation is important; but tick boxes alone lack the necessary nuance required for interpretation\u201d and that \u201ctick boxes may cause confusion when clinicians read XDR without realizing that option is not selected. Ideal to add a comment about resistance\u201d. To address this concern we added a \u201cNo drug resistance predicted\u201d option to the check-boxes (absent from the survey design options), and included shading elements to emphasize the drug susceptibility result.\nThe qualitative responses to Q17 (Fig. 6B) provided further insight into the uncertainty around how best to represent genomic relatedness suggestive of an epidemiological relatedness. 
Some participants felt that data related to surveillance tasks should not appear in a report that is also meant for clinicians, either because it was not relevant to this audience \u2014 \u201c[this data] should not appear in the report. It should only be given to field epi and researchers. Overloading the clinical report would be deteriorating\u201d and \u201cnot useful for a clinician\u201d \u2014 or because they were uncertain about its interpretation: \u201ccluster detection would be fine for those who already know what a cluster is\u201d and \u201cmy patient\u2019s isolate is 6 SNPs from someone diagnosed 3 years ago. What is the clinical action?\u201d\nOf the design choices for cluster detection, several participants stated that many of the options, including the control, \u201c[included] too much information and [were] unnecessary for routine diagnosis\/treatment\u201d. However, others felt that the options did not provide sufficient detail and offered alternatives, such as \u201cif you can combine the phylogenetic tree with some kind of graph showing temporal spread that would be perfect. Adding geographical data would be a really helpful bonus too\u201d. This is an area of reporting that requires further investigation and was not fully resolved in our study.\nFinally, participants were candid about those design options that did not work well. For example, of the report design with many graphics (Fig. 6C, option D), participants indicated it was \u201cdistracting; looks like a set of roadworks rather than a microbiology report\u201d and that it was important to \u201ckeep it simple\u201d. 
Their feedback also revealed when our phrasing on the survey instruments was unclear.\n\nDeveloping a final report template \nThere are no prescriptive guidelines around integrating our quantitative data, qualitative data, and ISO 15189:2012 reporting requirements; thus, we have attempted to be as transparent and empirical as possible in justifying our final design (Fig. 7). A more thorough walkthrough is presented in Fig. S1 and here we highlight selected choices. The final prototype is implemented in LaTeX and is available online as a template accessible at http:\/\/www.cs.ubc.ca\/labs\/imager\/tr\/2017\/MicroReportDesign\/.\n\nFig. 7 Original (A) and revised reports (B). The revised report uses empirical evidence gathered through multiple stages of a human-centered design process. Note that the image in the upper corner of the revised report is a placeholder for an organizational logo.\n\nWe first incorporated ISO 15189:2012 requirements (see Fig. S1) into the final report template and then turned to the preferences expressed in the design choice questionnaire. Overall, information was structured to mirror the TB workflow: diagnosis, treatment, then surveillance. We chose to limit bolding to relevant information and used shading to highlight important and actionable clinical information, under the rationale that appropriate use of emphasis could facilitate an accurate and quick reading of the report, with detailed information present but de-emphasized.\nIn two instances, our design decisions deviated from participant preferences: we opted to use one column instead of two, and we presented detailed genomic resistance data on the first page of the report, rather than the second page. A single column was chosen as all of the information ranked as important by participants could be presented on a single page without the need to condense information into two columns. 
Because many of the resistotype details of the original report, such as mutation source and individual nucleotide changes (Fig. 1), were not included in the revised report, it was possible to present all of the participants\u2019 desired data in a single table on one page.\nA draft of the final design was presented to a new cohort of TB stakeholders at a September 2017 expert working group on standardized reporting of TB genomic resistance data. Through a group discussion, subtle changes to the report were made, including updating some of the language used (for example, replacing occurrences of the word \u201csensitive\u201d with \u201csusceptible\u201d), adding the lineage to the organism section, and adding fields to the tables describing the sample and the assay, such as what type of material was sequenced (pure culture, direct specimen) and what sequencing platform was used.\n\nDiscussion \nMicrobial genomics is playing an increasingly important role in public health microbiology, and its successful implementation in the clinic will rely not just on validation and accreditation of WGS-based tests, but also on how effective the resulting reports are for stakeholders, including clinicians. Using design study methodology, we developed a two-page report template to communicate WGS-derived test results related to TB diagnosis, drug susceptibility testing, and clustering.\nTo our knowledge, this project is the first formal inquiry into human-centered design for microbial genomics reporting. We argue that the application of human-centered design methodologies allowed us to improve not only the visual aesthetics of the final report, but also its functionality, by carefully coupling stakeholder tasks, data, and constraints to techniques from information and graphic design. 
Giving the original report a \u201cgraphic design facelift\u201d would not have improved the functionality, as some of the information in the original report was found to be unnecessary, was presented in a way that could lead to misinterpretation, or did not take into account stakeholder constraints. For example, interviews and surveys revealed procedural and data constraints our study team had not anticipated, including the limited time available for clinicians to read laboratory reports and the need for simple, black-and-white formatting amenable to media ranging from electronic delivery to fax; these findings were critical to shaping the downstream design process. Furthermore, in nearly every case, study participants preferred our alternative design elements, informed by empirical findings in the discovery stage, over the control elements derived from the original report. Our approach also suggested that some participants are not confident in their ability to interpret certain types of genomic data. As WGS moves towards routine clinical use, it is clear that successful implementation of genomic assays will also require complementary education and training opportunities for those individuals regularly interacting with WGS-derived data.\nAlthough human-centered information visualization design methodologies are commonly used in software development, it could be asked whether they are warranted in a report design project. One advantage of tackling the simpler problem of report design is that it allows us to demonstrate design study methodology in action and link evidence to design decisions more clearly than with a software product. We also collected data with the intention of applying it to the development and evaluation of more complex reporting and data visualization software that we plan to create. 
Similarly, others can use our approach or our data to inform the design of simple or complex applications elsewhere in pathogen genomics and bioinformatics.\nThe exploratory nature of this project brings with it certain limitations. First, our participants were identified through convenience and snowball sampling within the authors\u2019 networks and thus are likely to be more experienced with the clinical application of microbial genomics. While this is appropriate for the context of our collaboration, in which our goal is redesigning a report for use by the COMPASS-TB team and collaborating laboratories, it does limit our ability to generalize the findings to other settings. WGS is only used routinely in a small number of laboratories, and even if its reach were larger, these may be settings where English is not the first language used in reporting clinical results, or where written text is read in different ways, both of which would affect our design choices. Second, we did not have a priori knowledge of the effect sizes (i.e., extent of preferential difference for each type of question) in the design choice questionnaire, making sample size calculations challenging. Had a priori effect sizes been available, the study could have been powered, for example, for the smallest or average effect size. To avoid mischaracterizing our results, we have relied on primarily descriptive statistics, without tests for statistical significance, and assert that our findings are best interpreted as first steps toward a better understanding of how information and visualization design can play a role in reporting pathogen WGS data. However, when confidence intervals were calculated for the results of the design choice questionnaire, we observed that non-overlapping confidence intervals separated user preferences as well as the deviation from a random score metric that we primarily used in our analysis. 
We argue the latter is a useful measure for exploratory studies without clear a priori knowledge of effect sizes for proper sample size calculations. Finally, we did not undertake a head-to-head experimental comparison between the original report design and the revised design. While this comparison had been planned at the outset of our project, the results of the design choice questionnaire showed such a clear preference for the alternative designs when comparing isolated components that we concluded there was no need for such a final test as it would yield little new evidence.\nFor researchers wishing to undertake a similar human-centered design approach, we have summarized our primary findings into three experimental guidelines and five design guidelines. These guidelines arose from our experience throughout this report redesign process but are intended to apply generally to the process of designing visualizations for microbial genomic data or other human health-related information.\nThe three experimental guidelines reflect the areas of the design methodology that we found to be particularly important in our data collection and analysis as well as the final report design process. First, design around tasks. It is tempting to simply ask stakeholders what they want to see in a final design, but many of them will not be able to create an effective end product because design is not their principal area of expertise. However, stakeholders know very well what they do on a daily basis and can indicate data that are relevant to those specific tasks and can indicate in which areas they require more support. The role of the designer is to marry those tasks, clinical workflows, and constraints into design alternatives. Depending on the tasks and context, many design alternatives might be possible, making use of color, more complex visualizations, or interactivity. 
In other situations, such as the one presented here, design constraints limit the range of prototypes that can be generated. \nSecond, compare isolated components, and not just whole systems. Here we use \"system\" to mean either a simple report or a more complex software system. Comparing whole systems can overload an individual\u2019s working memory, meaning they may rely on heuristics such as preferences around style or distracting elements, when assessing and comparing full systems.[26] Presenting isolated design elements and controlling for non-tested factors (i.e., font, text) can reduce the burden on working memory and isolate the effect of design alternatives. \nFinally, compare against a control whenever possible. If a prior report or system exists, or if there are commonly agreed upon conventions in the literature or field, it is useful to compare novel designs against an existing one. More generally, comparison of multiple alternatives is the most critical defense against defaulting to ad hoc designs and the most important step of our human-centered design methodology.\nOur five design guidelines reflect techniques from information visualization and graphic design that we used in an attempt to improve the readability of the report and balance different stakeholder information needs. First, structure information such that it mimics a stakeholder\u2019s workflow. In this case, the report prioritizes a clinical workflow, and this workflow is reflected in the report\u2019s design through the use of gestalt principles[27], treating the whole as greater than the sum of its parts. Specifically, we group related data and order information hierarchically so that the document is read according to the clinical narrative we established in the discovery phase. Second, use emphasis carefully. Here, bolding, text size, and shading were reserved to highlight important data and were not applied to aesthetic aspects of the report design. 
Third, present dense information in a careful and structured manner. Stakeholders should not have to search for relevant information, a cognitively expensive task[28] that can result in information loss.[29] Through the combination of gestalt, visual hierarchy, and careful use of emphasis, it is possible to present a lot of information by creating two layers: a higher-level \u201cquick glance\u201d layer and a more detailed lower layer. The quick glance layer should contain the relevant and clinically actionable information and should be visually salient (i.e., \u201cpop-out\u201d), while the detailed layer should be less visually salient and contain additional information that some, but not all, stakeholders may wish to have (based on their tasks and data needs). Fourth, use words precisely. Specific terminology may not be uniformly understood or consistently interpreted by stakeholders, particularly when the designer and the stakeholders come from different domains, or even when individuals in the same domain have markedly different daily workflows, such as bioinformaticians and clinicians. Finally, if using images, do so judiciously. Images can be distracting when they do not convey actionable information relevant to the stakeholder.\n\nConclusions \nWe applied human-centered design methodologies to redesign a clinical report for a reference microbiology laboratory, but the techniques we used \u2014 drawn from more complex applications in information visualization and human\u2013computer interaction \u2014 can be used in other scenarios, including the development of more complex data dashboards, data visualization or other bioinformatics tools. 
By introducing these techniques to the microbial genomics, bioinformatics, and genomic epidemiology communities, we hope to inspire their further use of evidence-based, human-centric design.\n\nSupplemental information \nSupplemental materials found here (PDF).\n\nAdditional information and declarations \nCompeting interests \nThe authors declare there are no competing interests.\n\nAuthor contributions \nAnamaria Crisan and Geoffrey McKee conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents\/materials\/analysis tools, wrote the paper, prepared figures and\/or tables, reviewed drafts of the paper. Tamara Munzner conceived and designed the experiments, analyzed the data, wrote the paper, reviewed drafts of the paper. Jennifer L. Gardy conceived and designed the experiments, performed the experiments, analyzed the data, wrote the paper, reviewed drafts of the paper.\n\nHuman ethics \nOur research was reviewed and approved by the University of British Columbia\u2019s Behavioural Research Ethics Board (H10-03336).\n\nData availability \nData + Analysis Repository: https:\/\/github.com\/amcrisan\/TBReportRedesign\nReport Example Repository: https:\/\/github.com\/amcrisan\/TB-WGS-MicroReport\n\nFunding \nAnamaria Crisan is supported by the Vanier Scholars Program; Dr. Jennifer L. Gardy is supported by the Canada Research Chairs Program and the Michael Smith Foundation for Health Research Scholar Award Program; and Dr. Tamara Munzner is supported by the Natural Sciences and Engineering Research Council of Canada Discovery Program RGPIN-2014-06309. The work described here was funded by the British Columbia Centre for Disease Control Foundation for Population and Public Health, as well as Genome BC, through grant G01SMA: Sharing Mycobacterial Analytic Capacity. 
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.\n\nAcknowledgements \nWe would like to acknowledge and thank all study participants who took the time to respond to the surveys and provided excellent and valuable insights for our work. We would also like to thank Ana Gibertoni-Cruz, Grace Smith, and Tim Walker for their contributions in the early stages of this project and in recruiting participants, and the members of the information visualization group at the University of British Columbia who participated in our design sprint: Kimberly Dextras-Romagnino, Dylan Dong, Georges Hattab, and Zipeng Liu.\n\nReferences \n\n\n\u2191 Fukui, Y.; Aoki, K.; Okuma, S. et al.. \"Metagenomic analysis for detecting pathogens in culture-negative infective endocarditis\". Journal of Infection and Chemotherapy 21 (12): 882\u20134. doi:10.1016\/j.jiac.2015.08.007. PMID 26360016.   \n\n\u2191 Loman, N.J.; Constantinidou, C.; Christner, M. et al.. \"A culture-independent sequence-based metagenomics approach to the investigation of an outbreak of Shiga-toxigenic Escherichia coli O104:H4\". JAMA 309 (14): 1502-10. doi:10.1001\/jama.2013.3231. PMID 23571589.   \n\n\u2191 Bradley, P.; Gordon, N.C.; Walker, T.M. et al.. \"Rapid antibiotic-resistance predictions from genome sequence data for Staphylococcus aureus and Mycobacterium tuberculosis\". Nature Communications 6: 10063. doi:10.1038\/ncomms10063. PMC PMC4703848. PMID 26686880. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4703848 .   \n\n\u2191 4.0 4.1 Pankhurst, L.J.; Del Ojo Elias, C.; Votintseva, A.A. et al.. \"Rapid, comprehensive, and affordable mycobacterial diagnosis with whole-genome sequencing: A prospective study\". The Lancet Respiratory Medicine 4 (1): 49-58. doi:10.1016\/S2213-2600(15)00466-X. PMC PMC4698465. PMID 26669893. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4698465 .   
\n\n\u2191 Walker, T.M.; Kohl, T.A.; Omar, S.V. et al.. \"Whole-genome sequencing for prediction of Mycobacterium tuberculosis drug susceptibility and resistance: A retrospective cohort study\". The Lancet Infectious Diseases 15 (10): 1193\u20131202. doi:10.1016\/S1473-3099(15)00062-6. PMC PMC4579482. PMID 26116186. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4579482 .   \n\n\u2191 Nikolayevskyy, V.; Kranzer, K.; Niemann, S. et al.. \"Whole genome sequencing of Mycobacterium tuberculosis for detection of recent transmission and tracing outbreaks: A systematic review\". Tuberculosis 98: 77-85. doi:10.1016\/j.tube.2016.02.009. PMID 27156621.   \n\n\u2191 Budowle, B.; Connell, N.D.; Bielecka-Oder, A. et al.. \"Validation of high throughput sequencing and microbial forensics applications\". Investigative Genetics 5: 9. doi:10.1186\/2041-2223-5-9. PMC PMC4123828. PMID 25101166. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4123828 .   \n\n\u2191 Gargis, A.S.; Kalman, L.; Lubin, I.M.. \"Assuring the Quality of Next-Generation Sequencing in Clinical Microbiology and Public Health Laboratories\". Journal of Clinical Microbiology 54 (12): 2857-2865. doi:10.1128\/JCM.00949-16. PMC PMC5121372. PMID 27510831. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5121372 .   \n\n\u2191 Driscoll, T.; Gabbard, J.L.; Mao, C. et al.. \"Integration and visualization of host-pathogen data related to infectious diseases\". Bioinformatics 27 (16): 2279-87. doi:10.1093\/bioinformatics\/btr391. PMC PMC3150046. PMID 21712250. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3150046 .   \n\n\u2191 Argim\u00f3n, S.; Abudahab, K.; Goater, R.J. et al.. \"Microreact: Visualizing and sharing data for genomic epidemiology and phylogeography\". Microbial Genomics 2 (11): e000093. doi:10.1099\/mgen.0.000093. PMC PMC5320705. PMID 28348833. 
http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5320705 .   \n\n\u2191 Carroll, L.N.; Au, A.P.; Detwiler, L.T. et al.. \"Visualization and analytics tools for infectious disease epidemiology: A systematic review\". Journal of Biomedical Informatics 51: 287-98. doi:10.1016\/j.jbi.2014.04.006. PMC PMC5734643. PMID 24747356. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5734643 .   \n\n\u2191 Leslie, K.O.; Rosai, J.. \"Standardization of the surgical pathology report: Formats, templates, and synoptic reports\". Seminars in Diagnostic Pathology 11 (4): 253-7. PMID 7878300.   \n\n\u2191 Valenstein, P.N.. \"Formatting pathology reports: Applying four design principles to improve communication and patient safety\". Archives of Pathology and Laboratory Medicine 132 (1): 84-94. doi:10.1043\/1543-2165(2008)132[84:FPRAFD]2.0.CO;2. PMID 18181680.   \n\n\u2191 Renshaw, S.A.; Mena-Allauca, M.; Touriz, M. et al.. \"The impact of template format on the completeness of surgical pathology reports\". Archives of Pathology and Laboratory Medicine 138 (1): 121-4. doi:10.5858\/arpa.2012-0733-OA. PMID 24377820.   \n\n\u2191 Nygren, E.; Wyatt, J.C.; Wright, P.. \"Helping clinicians to find data and avoid delays\". Lancet 352 (9138): 1462-6. doi:10.1016\/S0140-6736(97)08307-4. PMID 9808009.   \n\n\u2191 16.0 16.1 Wright, P.; Jansen, C.; Wyatt, J.C.. \"How to limit clinical errors in interpretation of data\". Lancet 352 (9139): 1539-43. doi:10.1016\/S0140-6736(98)08308-1. PMID 9820319.   \n\n\u2191 Zipkin, D.A.; Umscheid, C.A.; Keating, N.L. et al.. \"Evidence-based risk communication: A systematic review\". Annals of Internal Medicine 161 (4): 270-80. doi:10.7326\/M14-0295. PMID 25133362.   \n\n\u2191 Ancker, J.S.; Senathirajah, Y.; Kukafka, R. et al.. \"Design features of graphs in health risk communication: A systematic review\". JAMIA 13 (6): 608\u201318. doi:10.1197\/jamia.M2115. PMC PMC1656964. PMID 16929039. 
http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC1656964 .   \n\n\u2191 Hettinger, A.Z.; Roth, E.M.; Bisantz, A.M.. \"Cognitive engineering and health informatics: Applications and intersections\". Journal of Biomedical Informatics 67: 21\u201333. doi:10.1016\/j.jbi.2017.01.010. PMID 28126605.   \n\n\u2191 Horsky, J.; Schiff, G.D.; Johnston, D. et al.. \"Interface design principles for usable decision support: A targeted review of best practices for clinical prescribing interventions\". Journal of Biomedical Informatics 45 (6): 1202-16. doi:10.1016\/j.jbi.2012.09.002. PMID 22995208.   \n\n\u2191 \"Tuberculosis in England: 2016 Report\" (PDF). Public Health England. September 2016. https:\/\/www.gov.uk\/government\/uploads\/system\/uploads\/attachment_data\/file\/654294\/TB_Annual_Report_2016_GTW2309_errata_v1.2.pdf .   \n\n\u2191 22.0 22.1 Sedlmair, M.; Meyer, M.; Munzner, T. et al.. \"Design Study Methodology: Reflections from the Trenches and the Stacks\". IEEE Transactions on Visualization and Computer Graphics 18 (12): 2431-40. doi:10.1109\/TVCG.2012.213. PMID 26357151.   \n\n\u2191 23.0 23.1 Creswell, J.W. (2014). Research Design: Qualitative, Quantitative and Mixed Methods Approaches (4th ed.). Sage Publications. ISBN 9781452226101.   \n\n\u2191 Lloyd, D.; Dykes, J.. \"Human-centered approaches in geovisualization design: Investigating multiple methods through a long-term case study\". IEEE Transactions on Visualization and Computer Graphics 17 (12): 2498-507. doi:10.1109\/TVCG.2011.209. PMID 22034371.   \n\n\u2191 Vradenburg, K.; Mao, J.-Y.; Smith, P.W.; Carey, T.. \"A survey of user-centered design practice\". Proceedings of the SIGCHI Conference on Human Factors in Computing Systems 2002: 471-478. doi:10.1145\/503376.503460.   \n\n\u2191 Shah, A.K.; Oppenheimer, D.M. (2008). \"Heuristics made easy: An effort-reduction framework\". Psychological Bulletin 134 (2): 207\u201322. doi:10.1037\/0033-2909.134.2.207. PMID 18298269.   
\n\n\u2191 Moore, P.; Fitz, C.. \"Using Gestalt theory to teach document design and graphics\". Technical Communication Quarterly 2 (4): 389\u2013410. doi:10.1080\/10572259309364549.   \n\n\u2191 Chang, T.-W.; Kinshuk; Chen, N.-S-.; Yu, P.-T. (2012). \"The effects of presentation method and information density on visual search ability and working memory load\". Computers and Education 58 (2): 721-731. doi:10.1016\/j.compedu.2011.09.022.   \n\n\u2191 Shneiderman, B. (1996). \"The eyes have it: A task by data type taxonomy for information visualizations\". Proceedings of the IEEE Symposium on Visual Languages, 1996 1996. doi:10.1109\/VL.1996.545307.   \n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to presentation. In several cases the PubMed ID was missing and was added to make the reference more useful. The original article lists references alphabetically, but this version \u2014 by design \u2014 lists them in order of appearance.\n\n\n\n\n\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\">https:\/\/www.limswiki.org\/index.php\/Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory<\/a>\n\t\t\t\t\tCategories: LIMSwiki journal articles (added in 2018)LIMSwiki journal articles (all)LIMSwiki journal articles on health informaticsLIMSwiki journal articles on reporting\n\nThis page was last modified on 15 February 2018, at 20:55.\nContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n","428fb6eb50c74d741daa88c4061eeab2_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div 
id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:Evidence-based design and evaluation of a whole genome sequencing clinical report for the reference microbiology laboratory<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p><b>Background<\/b>: Microbial genome <a href=\"https:\/\/www.limswiki.org\/index.php\/Sequencing\" title=\"Sequencing\" class=\"mw-disambig wiki-link\" target=\"_blank\" data-key=\"e36167a9eb152ca16a0c4c4e6d13f323\">sequencing<\/a> is now being routinely used in many <a href=\"https:\/\/www.limswiki.org\/index.php\/Clinical_laboratory\" title=\"Clinical laboratory\" target=\"_blank\" class=\"wiki-link\" data-key=\"307bcdf1bdbcd1bb167cee435b7a5463\">clinical<\/a> and <a href=\"https:\/\/www.limswiki.org\/index.php\/Public_health_laboratory\" title=\"Public health laboratory\" target=\"_blank\" class=\"wiki-link\" data-key=\"34ffb658cb79bf322c65efaad95996f5\">public health laboratories<\/a>. Understanding how to <a href=\"https:\/\/www.limswiki.org\/index.php\/Reporting\" title=\"Reporting\" class=\"mw-disambig wiki-link\" target=\"_blank\" data-key=\"c83685a5b9154b511ee113bbffedb2e5\">report<\/a> complex genomic test results to stakeholders who may have varying familiarity with genomics \u2014 including clinicians, laboratorians, epidemiologists, and researchers \u2014 is critical to the successful and sustainable implementation of this new technology; however, there are no evidence-based guidelines for designing such a report in the pathogen genomics domain. 
Here, we describe an iterative, human-centered approach to creating a report template for communicating tuberculosis (TB) genomic test results.\n<\/p><p><b>Methods<\/b>: We used design study methodology \u2014 a human-centered approach drawn from the <a href=\"https:\/\/www.limswiki.org\/index.php\/Data_visualization\" title=\"Data visualization\" target=\"_blank\" class=\"wiki-link\" data-key=\"4a3b86cba74bc7bb7471aa3fc2fcccc3\">information visualization<\/a> domain \u2014 to redesign an existing clinical report. We used expert consults and an online questionnaire to discover various stakeholders\u2019 needs around the types of data and tasks related to TB that they encounter in their daily workflow. We also evaluated their perceptions of and familiarity with genomic data, as well as its utility at various clinical decision points. These data shaped the design of multiple prototype reports that were compared against the existing report through a second online survey, with the resulting qualitative and quantitative data informing the final, redesigned report.\n<\/p><p><b>Results<\/b>: We recruited 78 participants, 65 of whom were clinicians, nurses, laboratorians, researchers, and epidemiologists involved in TB diagnosis, treatment, and\/or surveillance. Our first survey indicated that participants were largely enthusiastic about genomic data, with the majority agreeing on its utility for certain TB diagnosis and treatment tasks and many reporting some confidence in their ability to interpret this type of data (between 58.8% and 94.1%, depending on the specific data type). When we compared our four prototype reports against the existing design, we found that for the majority (86.7%) of design comparisons, participants preferred the alternative prototype designs over the existing version, and that both clinicians and non-clinicians expressed similar design preferences. 
Participants showed clearer design preferences when asked to compare individual design elements versus entire reports. Both the quantitative and qualitative data informed the design of a revised report, available online as a LaTeX template.\n<\/p><p><b>Conclusions<\/b>: We show how a human-centered design approach integrating quantitative and qualitative feedback can be used to design an alternative report for representing complex microbial genomic data. We suggest experimental and design guidelines to inform future design studies in the <a href=\"https:\/\/www.limswiki.org\/index.php\/Bioinformatics\" title=\"Bioinformatics\" target=\"_blank\" class=\"wiki-link\" data-key=\"8f506695fdbb26e3f314da308f8c053b\">bioinformatics<\/a> and microbial genomics domains. We also suggest that this type of mixed-methods study is important to facilitate the successful translation of pathogen genomics in the clinic, not only for clinical reports but also more complex bioinformatics data visualization software.\n<\/p><p><b>Keywords<\/b>: human-centered design, next-generation sequencing, report, tuberculosis, genome\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<p>Whole genome sequencing (WGS) is quickly moving from proof-of-concept research into routine clinical and public health use. 
WGS can diagnose infections at least as accurately as current protocols<sup id=\"rdp-ebb-cite_ref-FukuiMeta15_1-0\" class=\"reference\"><a href=\"#cite_note-FukuiMeta15-1\" rel=\"external_link\">[1]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-LomanACulture13_2-0\" class=\"reference\"><a href=\"#cite_note-LomanACulture13-2\" rel=\"external_link\">[2]<\/a><\/sup>, can predict antimicrobial resistance phenotypes for certain drugs<sup id=\"rdp-ebb-cite_ref-BradleyRapid15_3-0\" class=\"reference\"><a href=\"#cite_note-BradleyRapid15-3\" rel=\"external_link\">[3]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-PankhurstRapid16_4-0\" class=\"reference\"><a href=\"#cite_note-PankhurstRapid16-4\" rel=\"external_link\">[4]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-WalkerWhole15_5-0\" class=\"reference\"><a href=\"#cite_note-WalkerWhole15-5\" rel=\"external_link\">[5]<\/a><\/sup> with high concordance to culture-based testing methods, and can be used in outbreak surveillance to resolve transmission clusters at a resolution not possible with existing genomic or epidemiological methods.<sup id=\"rdp-ebb-cite_ref-NikolayevskyyWhole16_6-0\" class=\"reference\"><a href=\"#cite_note-NikolayevskyyWhole16-6\" rel=\"external_link\">[6]<\/a><\/sup> Importantly, WGS offers faster turnaround times compared to many culture-based tests, particularly for antimicrobial resistance testing in slow-growing bacteria.\n<\/p><p>As reference microbiology laboratories move towards accreditation of WGS for routine clinical use, the community is turning its attention toward standardization, developing standard operating procedures for reproducible sample handling, sequencing, and downstream bioinformatics analysis.<sup id=\"rdp-ebb-cite_ref-BudowleValid14_7-0\" class=\"reference\"><a href=\"#cite_note-BudowleValid14-7\" rel=\"external_link\">[7]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-GargisAssur16_8-0\" class=\"reference\"><a href=\"#cite_note-GargisAssur16-8\" rel=\"external_link\">[8]<\/a><\/sup> Reporting genomic 
microbiology test results in a way that is interpretable by clinicians, nurses, laboratory staff, researchers, and surveillance experts and that meets regulatory requirements is equally important; however, relatively little effort has been directed toward this area. WGS clinical reports are often produced in-house on an <i>ad hoc<\/i>, project-by-project basis, with the resulting product not necessarily meeting the needs of the many stakeholders using the report in their clinical and surveillance workflows.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Human-centered_design_in_the_clinical_laboratory\">Human-centered design in the clinical laboratory<\/span><\/h3>\n<p>The information visualization, human\u2013computer interaction, and usability engineering fields offer techniques and design guidelines that have informed bioinformatics tools, including Disease View<sup id=\"rdp-ebb-cite_ref-DriscollInteg11_9-0\" class=\"reference\"><a href=\"#cite_note-DriscollInteg11-9\" rel=\"external_link\">[9]<\/a><\/sup> for exploring host-pathogen interaction data and Microreact<sup id=\"rdp-ebb-cite_ref-Argim.C3.B3nMicro16_10-0\" class=\"reference\"><a href=\"#cite_note-Argim.C3.B3nMicro16-10\" rel=\"external_link\">[10]<\/a><\/sup> for visualizing phylogenetic trees in the context of epidemiological or clinical data. Although the public health community is beginning to recognize the potential role of visualization and analytics in daily laboratory workflows<sup id=\"rdp-ebb-cite_ref-CarrollVisual14_11-0\" class=\"reference\"><a href=\"#cite_note-CarrollVisual14-11\" rel=\"external_link\">[11]<\/a><\/sup> these techniques have not yet been applied to routine reporting of microbiological test results. 
However, work from the human health domain \u2014 particularly the formatting and display of pathology reports, where standardization is critical<sup id=\"rdp-ebb-cite_ref-LeslieStandard94_12-0\" class=\"reference\"><a href=\"#cite_note-LeslieStandard94-12\" rel=\"external_link\">[12]<\/a><\/sup> \u2014 sheds light on the complex task of clinical report design.\n<\/p><p>Valenstein reports four principles for organizing an effective pathology report: use headlines to emphasize key points, ensure design continuity over time and relative to other reports, consider information density, and reduce clutter<sup id=\"rdp-ebb-cite_ref-ValensteinFormat08_13-0\" class=\"reference\"><a href=\"#cite_note-ValensteinFormat08-13\" rel=\"external_link\">[13]<\/a><\/sup>, while Renshaw <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-RenshawTheImpact14_14-0\" class=\"reference\"><a href=\"#cite_note-RenshawTheImpact14-14\" rel=\"external_link\">[14]<\/a><\/sup> note that when pathology report templates were reformatted with numbering and bolding to highlight required information, template completion rates rose from 84 to 98%. 
Fixed, consistent layout of medical record elements, highlighting of data relative to background text, and single-page layout improve clinicians\u2019 ability to locate information<sup id=\"rdp-ebb-cite_ref-NygrenHelping98_15-0\" class=\"reference\"><a href=\"#cite_note-NygrenHelping98-15\" rel=\"external_link\">[15]<\/a><\/sup>, while <a href=\"https:\/\/www.limswiki.org\/index.php\/Information\" title=\"Information\" target=\"_blank\" class=\"wiki-link\" data-key=\"6300a14d9c2776dcca0999b5ed940e7d\">information<\/a> design principles, including visually structuring the document to separate different elements and organizing information to meet the needs of multiple stakeholder types, can reduce the number of errors in data interpretation.<sup id=\"rdp-ebb-cite_ref-WrightHow98_16-0\" class=\"reference\"><a href=\"#cite_note-WrightHow98-16\" rel=\"external_link\">[16]<\/a><\/sup>\n<\/p><p>Work in the <a href=\"https:\/\/www.limswiki.org\/index.php\/Electronic_health_record\" title=\"Electronic health record\" target=\"_blank\" class=\"wiki-link\" data-key=\"f2e31a73217185bb01389404c1fd5255\">electronic health record<\/a> (EHR) and patient risk communication domains has also provided insight into not just the final product but also the process of effective design. 
Through quantitative and qualitative evaluations, research has shown that some EHRs are difficult to use because they were not designed to support clinical tasks and information retrieval, but rather data entry.<sup id=\"rdp-ebb-cite_ref-WrightHow98_16-1\" class=\"reference\"><a href=\"#cite_note-WrightHow98-16\" rel=\"external_link\">[16]<\/a><\/sup> Reviews of the risk communication literature note that while many visual aids improve patients\u2019 understanding of risk<sup id=\"rdp-ebb-cite_ref-ZipkinEvidence14_17-0\" class=\"reference\"><a href=\"#cite_note-ZipkinEvidence14-17\" rel=\"external_link\">[17]<\/a><\/sup>, the design features that viewers preferred \u2014 namely simplistic, minimalist designs \u2014 were not necessarily those that led to an accurate interpretation of the underlying data.<sup id=\"rdp-ebb-cite_ref-AnckerDesign06_18-0\" class=\"reference\"><a href=\"#cite_note-AnckerDesign06-18\" rel=\"external_link\">[18]<\/a><\/sup> Together, these gaps indicate a need for a human-centered, participatory approach iteratively incorporating both design and evaluation.<sup id=\"rdp-ebb-cite_ref-HettingerCog17_19-0\" class=\"reference\"><a href=\"#cite_note-HettingerCog17-19\" rel=\"external_link\">[19]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-HorskyInterface12_20-0\" class=\"reference\"><a href=\"#cite_note-HorskyInterface12-20\" rel=\"external_link\">[20]<\/a><\/sup>\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Collaboration_context.E2.80.94COMPASS-TB\">Collaboration context\u2014COMPASS-TB<\/span><\/h3>\n<p>The COMPASS-TB project was a proof-of-concept study demonstrating the feasibility and utility of WGS for diagnosing tuberculosis (TB) infection, evaluating an isolate\u2019s antimicrobial sensitivity\/resistance and genotyping the isolate to identify epidemiologically related cases.<sup id=\"rdp-ebb-cite_ref-PankhurstRapid16_4-1\" class=\"reference\"><a href=\"#cite_note-PankhurstRapid16-4\" rel=\"external_link\">[4]<\/a><\/sup> On the basis of 
COMPASS-TB\u2019s results, Public Health England (PHE) has implemented routine WGS in the TB reference laboratory<sup id=\"rdp-ebb-cite_ref-PHETuber16_21-0\" class=\"reference\"><a href=\"#cite_note-PHETuber16-21\" rel=\"external_link\">[21]<\/a><\/sup>; however, this requires changing how mycobacteriology results are reported to clinical and public health stakeholders. The COMPASS-TB pilot used reports designed by the project team, but as clinical implementation within PHE progressed, team members expressed an interest in redesigning the report (Fig. 1) to facilitate interpretation of this new data type and align <a href=\"https:\/\/www.limswiki.org\/index.php\/Laboratory\" title=\"Laboratory\" target=\"_blank\" class=\"wiki-link\" data-key=\"c57fc5aac9e4abf31dccae81df664c33\">laboratory<\/a> reporting practices with the needs of multiple TB stakeholders.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig1_Crisan_PeerJ2018_6.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"a82cf96de00dde524ddda88c504d22bb\"><img alt=\"Fig1 Crisan PeerJ2018 6.jpg\" src=\"https:\/\/www.limswiki.org\/images\/b\/bf\/Fig1_Crisan_PeerJ2018_6.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Fig. 1<\/b> An earlier COMPASS-TB report design<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>We undertook a mixed-methods and iterative human-centered approach to inform the design and evaluation of a clinical TB WGS report. 
Specifically, we chose to use design study methodology<sup id=\"rdp-ebb-cite_ref-SedlmairDesign12_22-0\" class=\"reference\"><a href=\"#cite_note-SedlmairDesign12-22\" rel=\"external_link\">[22]<\/a><\/sup>, an approach adopted from the information visualization discipline. When using a design study methodology approach, researchers examine a problem faced by a group of domain specialists, explore their available data and the tasks they perform in reference to that problem, create a product \u2014 in our case a report, but, in the more general case, a visualization system \u2014 to help solve the problem, assess the product with domain specialists, and reflect on the process to improve future design activities. Compared to an <i>ad hoc<\/i> approach to design, design study methodology engages domain specialists and grounds the design and evaluation of the visualization system in tasks \u2014 in this case TB diagnosis, treatment, and surveillance \u2014 as well as data. It is this marriage of data and tasks to design choices, informed by real needs and supported by empirical evidence, that results in a final product that is relevant, usable, and interpretable.\n<\/p><p>Here we describe our application of design study methodology to the COMPASS-TB report redesign. Targeting clinical and public health stakeholders with at least some familiarity with public health genomics, we show how evidence-based design can be incorporated into the emerging field of clinical microbial genomics and present a final report template, which may be ported to other organisms. 
We also recommend a set of guidelines to support future applications of human-centered design in microbial genomics, whether for report designs or for more complex bioinformatics visualization software.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Materials_and_methods\">Materials and methods<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Overview_of_design_study_methodology\">Overview of design study methodology<\/span><\/h3>\n<p>The design study methodology<sup id=\"rdp-ebb-cite_ref-SedlmairDesign12_22-1\" class=\"reference\"><a href=\"#cite_note-SedlmairDesign12-22\" rel=\"external_link\">[22]<\/a><\/sup> is an iterative framework outlining an approach to human-centered visualization design and evaluation. It consists of three phases \u2014 precondition, core analysis, and reflection \u2014 that together comprise nine stages. The precondition and reflection phases focus on establishing collaborations and writing up research findings, respectively, and are not elaborated upon further here. We describe our work within each of the three stages of the core analysis phase: discovery, design, and implementation (Fig. 2). 
We define domain specialists in this case as the TB stakeholders \u2014 clinicians, laboratorians, and epidemiologists \u2014 who regularly use reports from the reference mycobacteriology laboratory in their work.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig2_Crisan_PeerJ2018_6.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"a817fb8892b85471fccbb224c4bead01\"><img alt=\"Fig2 Crisan PeerJ2018 6.jpg\" src=\"https:\/\/www.limswiki.org\/images\/1\/1c\/Fig2_Crisan_PeerJ2018_6.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Fig. 2<\/b> Our human-centered design approach. The core analysis phase of the design study methodology consists of discovery, design, and implementation stages. Using this methodological backbone, we collected and analyzed data using mixed-methods study designs in the discovery and design stages, which informed the final TB WGS clinical report design.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>Our research was reviewed and approved by the University of British Columbia\u2019s Behavioural Research Ethics Board (H10-03336). All data were collected through secure means approved by the university and were de-identified for analysis and sharing. Anonymized quantitative results from each of the surveys and the analysis code are available at <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/github.com\/amcrisan\/TBReportRedesign\" target=\"_blank\">https:\/\/github.com\/amcrisan\/TBReportRedesign<\/a> and in the Supplemental Information 1. 
We also provide the full text of our survey instruments in the Supplemental Information 1.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Discovery_stage\">Discovery stage<\/span><\/h3>\n<p>In the discovery stage, we first gathered qualitative data through expert consults to identify the data types used in TB diagnosis, treatment, and surveillance tasks. We then gathered quantitative data through an online survey to more robustly link particular data types to specific tasks. This staged approach to data gathering is known as the exploratory sequential model.<sup id=\"rdp-ebb-cite_ref-CreswellResearch14_23-0\" class=\"reference\"><a href=\"#cite_note-CreswellResearch14-23\" rel=\"external_link\">[23]<\/a><\/sup>\n<\/p><p>Our expert consults took the form of semi-structured interviews with seven individuals recruited from the COMPASS-TB project team, the British Columbia Centre for Disease Control (BCCDC), and the British Columbia Public Health Laboratory (BCPHL). The interview questions served as prompts to structure the conversation, but experts were free to comment, at any depth, on the different aspects of TB diagnosis, treatment, and surveillance. We took notes during the consults in order to identify the tasks and data types common to TB workflows in the U.K. and Canada, as well as to determine which tasks could be supported by WGS data.\n<\/p><p>Informed by the expert consults, we drafted a task and data questionnaire (text in Supplemental Information 1) to survey data types used across the TB workflow (see results for a list of data types); the role for WGS data in diagnosis, treatment, and surveillance tasks; and participants\u2019 confidence in interpreting different data types. The questionnaire primarily used multiple choice and true\/false questions, but it also included the optional entry of freeform text. 
The questionnaire was deployed online using the FluidSurveys platform, and participants were recruited using snowball and convenience sampling for a one-week period in July 2016. For questions pertaining to diagnostic and treatment tasks, we gathered information only from participants self-identifying as clinicians; for the remaining sections of the survey, all participants were prompted to answer each question.\n<\/p><p>Only completed questionnaires were used for analysis. For questions pertaining to participants\u2019 background, their perception of WGS utility, and their confidence interpreting WGS data, we report primarily descriptive statistics. To link TB workflow tasks to specific data types, we presented participants with different task-based scenarios related to diagnosis, treatment, and surveillance and asked which data types they would use to complete the task. For each pair of data type and task, we assigned a consensus score depending on the proportion of participants who reported using a data type for a specific task: 0 for fewer than 25% of participants, 1 for 25\u201350%, 2 for 50\u201375%, and 3 if more than 75% of participants reported using a specific data type for the task at hand. Consensus scores for a data type were also summed across the different tasks. Freeform text, when it was provided, was considered only to add context to participant responses.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Design_stage\">Design stage<\/span><\/h3>\n<p>The discovery stage revealed which data types to include in the redesigned report, while the goal of the design stage was to identify how they should be presented. We used a \"design sprint\" event to produce a series of prototype reports, which were then assessed through a second online questionnaire. 
This survey collected quantitative data on participants\u2019 preference for specific design elements, with participants also able to provide qualitative feedback on each element \u2014 a type of embedded mixed methods study design.<sup id=\"rdp-ebb-cite_ref-CreswellResearch14_23-1\" class=\"reference\"><a href=\"#cite_note-CreswellResearch14-23\" rel=\"external_link\">[23]<\/a><\/sup>\n<\/p><p>The design sprint was an interactive design session involving members of the University of British Columbia\u2019s Information Visualization research group, in which teams created alternative designs to report WGS data for the diagnosis, treatment, and surveillance tasks. Teams developed paper prototypes<sup id=\"rdp-ebb-cite_ref-LloydHuman11_24-0\" class=\"reference\"><a href=\"#cite_note-LloydHuman11-24\" rel=\"external_link\">[24]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-VradenburgASurv02_25-0\" class=\"reference\"><a href=\"#cite_note-VradenburgASurv02-25\" rel=\"external_link\">[25]<\/a><\/sup> of a complete WGS TB report and, at the completion of the event, presented their prototypes and the rationale for each design choice. The paper prototypes were then digitally mocked up, both as complete reports and as individual elements (see the results in Figs. 
3 and 4); these digital prototypes were standardized with respect to text, fonts, and sample data where appropriate and used as the basis of the second online survey.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig3_Crisan_PeerJ2018_6.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"07d8bb5d36dbf39f683b7dcc6f056f79\"><img alt=\"Fig3 Crisan PeerJ2018 6.jpg\" src=\"https:\/\/www.limswiki.org\/images\/b\/bb\/Fig3_Crisan_PeerJ2018_6.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Fig. 3<\/b> Digital mockups of complete report prototypes generated during the design sprint. (A) Prototype report 1, (B) prototype report 2, (C) prototype report 3, (D) prototype report 4<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p><a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig4_Crisan_PeerJ2018_6.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"862a6e7ed08ba4c2d0761938439bf089\"><img alt=\"Fig4 Crisan PeerJ2018 6.jpg\" src=\"https:\/\/www.limswiki.org\/images\/0\/0a\/Fig4_Crisan_PeerJ2018_6.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Fig. 4<\/b> Isolated design elements. The original report element, highlighted in red, is broken down into isolated design elements, each of which was tested independently in the report design survey. 
In this example, the original resistance summary yields five different alternative wordings and design elements.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>In the design choice questionnaire (text in Supplemental Information 1), we evaluated participants\u2019 preferences for individual design elements, comparing the options generated during the design sprint as well as the initial COMPASS-TB report design, which we hereafter refer to as the control design. As with the first survey, the questionnaire used FluidSurveys, with participants recruited using snowball and convenience sampling. Individuals who had previously participated in the data and task questionnaire were also invited to participate. The survey was open for one month beginning September 10, 2016 and was reopened to recruit additional participants for one month beginning January 5, 2017, as part of the registration for a TB WGS conference hosted by PHE. Only completed surveys were analyzed.\n<\/p><p>We used single-selection multiple-choice, Likert scale, and ranking questions to assess participant preferences. For multiple-choice and Likert scale questions, we calculated the number of participants that selected each option and report the sum. 
For questions that required participants to rank options we calculated a rescaled rank score as follows:\n<\/p><p><span id=\"rdp-ebb-M1\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M1\" aria-hidden=\"true\" style=\"background-image: url('https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/b1aa672bb3a6822de06911af3967c3650fb6276f'); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -2.005ex; width:48.207ex; height:6.676ex;\" \/><\/span>\n<\/p><p>where for each design choice (<i>D<sub>i<\/sub><\/i>), <i>i<\/i> = {1\u2026<i>N<\/i>} where <i>N<\/i> is the total number of design choices, <i>R<\/i> = {1\u2026<i>N<\/i>} is a raw rank (rank selected by a participant in the study), and <i>P<\/i> = {1\u2026<i>P<\/i>} is the total number of participants. In our study, 1 was the highest rank (most preferred) and <i>N<\/i> was the lowest rank (least preferred) option. As an example, if a design, <i>D<sub>1<\/sub><\/i>, is always ranked 1 (greatest preference by everyone), the sum of those ranks is <i>P<\/i>, resulting in a numerator of 0 and a rescaled rank score of 1. Alternatively, if a design, <i>D<sub>2<\/sub><\/i>, is always ranked last (<i>N<\/i>), the sum of those ranks will be <i>P<\/i>\u2217<i>N<\/i>, and a rescaled rank score of 0. Thus, the rescaled rank score ranges from 1 (consistently ranked as first) to 0 (consistently ranked last). 
This transformation from raw to rescaled ranks allows us to compare across questions with different numbers of options, but is predicated on each design alternative having a rank, which is why this approach was not extended to multiple choice questions.\n<\/p><p>To contextualize rescaled rank scores, we randomly permuted participants\u2019 scores 1,000 times and pooled the rescaled rank scores across these iterations to obtain an average score (intuitively and empirically this is 0.5 for the rank questions and <span id=\"rdp-ebb-M2\"><span class=\"mwe-math-mathml-inline mwe-math-mathml-a11y\" style=\"display: none;\"><\/span><meta class=\"mwe-math-fallback-image-inline\" id=\"rdp-ebb-M2\" aria-hidden=\"true\" style=\"background-image: url('https:\/\/en.wikipedia.org\/api\/rest_v1\/media\/math\/render\/svg\/8fb12c5f73f8763997550a9abdf59b00c4932ee4'); background-repeat: no-repeat; background-size: 100% 100%; vertical-align: -1.838ex; width:2.9ex; height:5.176ex;\" \/><\/span> for multiple choice questions) and standard deviation. For each design choice, we plotted its actual rescaled rank score against the distribution of random permutations, highlighting whether the score was within \u00b1 1, 2, or 3 standard deviations from the random permutation mean score. The closer a score was to the mean, the more probable that the participants\u2019 preferences were no better than random. 
We also calculated bootstrapped 95% confidence intervals for both rank and multiple choice type questions by re-sampling participants, with replacement, over 1,000 iterations.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Implementation_stage\">Implementation stage<\/span><\/h3>\n<p>By combining the results of the design choice questionnaire with medical test reporting requirements from the <a href=\"https:\/\/www.limswiki.org\/index.php\/ISO_15189\" title=\"ISO 15189\" target=\"_blank\" class=\"wiki-link\" data-key=\"e7867fe884a6e63d87c5a1bff5c28bc2\">ISO 15189:2012<\/a> standards, we developed a final template for reporting TB WGS data in the clinical laboratory. We used deviation from the random permutation mean score, described above, as an indicator of preference, with strong preferences being three or more standard deviations from the random score. When there was no strongly preferred element, we explain our design choice in the design walkthrough (Supplemental Information 1). We also considered consensus between clinicians and non-clinicians and defaulted to clinician preferences in instances of disagreement, as clinicians are the primary consumers of this report. 
The final prototype is implemented in LaTeX and is available online as a template accessible at <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.cs.ubc.ca\/labs\/imager\/tr\/2017\/MicroReportDesign\/\" target=\"_blank\">http:\/\/www.cs.ubc.ca\/labs\/imager\/tr\/2017\/MicroReportDesign\/<\/a>.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Results\">Results<\/span><\/h2>\n<p>Expert consults, the task and data questionnaire, and the design choice questionnaires recruited a total of 78 participants across different roles in TB management and control (Table 1).\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"4\"><b>Table 1.<\/b> Total study participants across different stages of the design study methodology. Notes: <sup>*<\/sup>National Reference Laboratory Service\n<\/td><\/tr>\n<tr>\n<th style=\"padding-left:10px; padding-right:10px;\">\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Expert consults\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Task and data questionnaire\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Design choice questionnaire\n<\/th><\/tr>\n<tr>\n<th style=\"padding-left:10px; padding-right:10px;\">Stage\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\" colspan=\"2\">Discovery\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Design\n<\/th><\/tr>\n<tr>\n<th style=\"padding-left:10px; padding-right:10px;\">Data collected\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Qualitative\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Quantitative\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Qualitative and Quantitative\n<\/th><\/tr>\n<tr>\n<th style=\"padding-left:10px; padding-right:10px;\">Participants\n<\/th>\n<th 
style=\"padding-left:10px; padding-right:10px;\"><i>N<\/i> (% survey total)\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\"><i>N<\/i> (% survey total)\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\"><i>N<\/i> (% survey total)\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Clinician\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2 (29%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">7 (40%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">13 (25%)\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Nurse\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">1 (14%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">3 (18%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">5 (9%)\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Laboratory\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2 (29%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">3 (18%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">8 (15%)\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Research\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0 (0%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">1 (6%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">8 (15%)\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Surveillance\n<\/td>\n<td style=\"background-color:white; padding-left:10px; 
padding-right:10px;\">1 (14%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">3 (18%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">8 (15%)\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Other<sup>*<\/sup>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">1 (14%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">0 (0%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">12 (21%)\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">TOTAL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">7 (100%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">17 (100%)\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">54 (100%)\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"Experts_emphasized_prioritizing_information_and_revealed_constraints\">Experts emphasized prioritizing information and revealed constraints<\/span><\/h3>\n<p>The objective of our expert consults was to understand how reports from the reference mycobacteriology laboratory are currently used in the day-to-day workflows of various TB stakeholders (including clinicians, laboratorians, epidemiologists, and researchers) and what data types are currently used to inform those tasks. 
Tasks and data types enumerated in the interviews were used to populate downstream quantitative questionnaires; however, the interviews also provided insights into how stakeholders viewed the role of <a href=\"https:\/\/www.limswiki.org\/index.php\/Genomics\" title=\"Genomics\" target=\"_blank\" class=\"wiki-link\" data-key=\"96a82dabf51cf9510dd00c5a03396c44\">genomics<\/a> in a clinical laboratory.\n<\/p><p>Amongst the procedural insights, stakeholders frequently reported that the biggest benefit of WGS over standard mycobacteriology laboratory protocols was to improve testing turnaround times and gather all test results into a single document, rather than having multiple lab reports arriving over weeks to months. Several experts emphasized that these benefits can only be realized if the WGS analytical pipeline has been clinically validated. Although our study team included a clinician and a TB researcher, two surprising procedural insights emerged from the consultations. First, multiple experts from a clinical background emphasized that this audience has extremely limited time to digest the information found on a clinical report. In describing their interaction with a laboratory report, one participant noted that \u201c10 seconds [to review content] is likely, one minute is luxurious\u201d while others described variations on the theme of wanting bottom-line, actionable information as quickly as possible. This insight profoundly shaped downstream decisions around how much data to include on a redesigned report and how to arrange it over the report to permit both a quick glance and a deeper dive. Second, experts indicated that laboratory reports were delivered using a variety of formats, including PDFs appended to electronic health records, faxes, or physical mail. 
This created design constraints at the outset of the project; our redesigned report needed to be legible no matter the medium, ruling out online interactivity, and needed to be black and white.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Experts_vary_in_their_perception_of_different_data_types\">Experts vary in their perception of different data types<\/span><\/h3>\n<p>At the data level, we observed that clinicians and non-clinicians differed in their perceptions of data types and in the level of detail they desired, perhaps reflecting the clinicians\u2019 procedural need for rapid interpretation. Clinicians emphasized the importance of presenting actionable results clearly and omitting those that were not clinically relevant for them. For example, when presented with the sequence quality data on the current COMPASS-TB report (Fig. 1) \u2014 metrics reflecting the quality of the sequencing run and downstream bioinformatics analysis \u2014 interviewees did not expect the lab to release poor quality data, given the presence of strict quality control mechanisms. ISO 15189:2012 standards require some degree of reporting around the measurement procedure and results, but this insight suggested such data might best be placed later in the report in a simplified format, or described in the report comments. Similarly, experts were also divided on the interpretability and utility of the phylogenetic tree in the epidemiological relatedness section of the current COMPASS-TB report, with clinicians noting that the case belonging to an epidemiological cluster would not impact their use of the genomic test results.\n<\/p><p>Experts also disagreed about the level of detail needed for WGS data, and this appeared to depend on whether the expert was a clinician as well as on their prior experience with WGS through the COMPASS-TB project. 
For example, one expert indicated that \u201cclinicians are wanting to know which mutations conferred resistance\u201d, while another noted that they \u201cdon\u2019t use these [mutations] right now routinely, so it\u2019s not that relevant\u201d. When asked to comment on the resistance summary table in the current COMPASS-TB report (Fig. 1), clinicians were concerned about the use of abbreviations for both drug names and susceptibility status leading to misinterpretation, and many were uncertain how to use the detailed mutation information in the resistotype table.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"WGS_data_is_vital.2C_but_some_lack_confidence_in_its_interpretation\">WGS data is vital, but some lack confidence in its interpretation<\/span><\/h3>\n<p>The expert consults provided a detailed overview of the tasks and data associated with TB care, allowing us to create a draft workflow outlining the TB diagnosis, treatment, and surveillance tasks coupled to the supporting data sources and data types (Fig. S1). This workflow was used to design the task and data questionnaire.\n<\/p><p>Of the 17 participants responding in full to the task and data questionnaire (Table 1), most were from the United Kingdom (88%) and most reported professional experience and formal education in infectious diseases and epidemiology (Table S1). Participants were less likely to report education at the masters or doctoral level in microbial genomics, biochemistry, or bioinformatics (Table S1). 
Fewer than half (47.1%) of participants had participated in TB WGS projects, but all (100%) participants were enthusiastic about the role of microbial genomics in infectious disease diagnosis, both today (47.1%) and in the near future, pending clinical validation (52.9%).\n<\/p><p>When queried about their potential future use of molecular data \u2014 whether WGS, genotyping, or other \u2014 participants indicated they foresaw themselves consulting, often or all the time, data on resistance-conferring mutations (82.3% of participants), MIRU-VNTR patterns (88.2%), epidemiological cluster membership (76.5%), single nucleotide polymorphism\/variant distances from other isolates (64.7%), and WGS quality metrics (58.8%) (Table S2). However, of the 14 different data types queried, the majority of participants only felt confident in interpreting four (MIRU-VNTR, drug susceptibility from culture, drug susceptibility from PCR or LPA, genomic clusters); most participants only felt somewhat confident, or not confident at all, interpreting the other data types (Table S3).\n<\/p><p>Moving from confidence in their own interpretation of laboratory data types to confidence in the utility of WGS data in general, the majority of participants were confident that information contained within the TB genome can be used to correctly perform organism speciation (76.5%), assign a patient to existing clusters (70.0%), rule out transmission events (64.7%), and to a lesser extent were confident TB WGS could be used to identify epidemiologically-related patients (58.8%) and predict drug susceptibility (52.9%) (Table S4). 
The majority of participants thought genomic data may be able to inform clinicians of appropriate treatment regimens (100%) and identify transmission events (94.1%); however, participants showed mixed consensus toward whether genomic data could be used to monitor treatment progress for TB (47.2%) or diagnose active TB (52.9%).\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Respondent_consensus_suggests_a_role_for_WGS_in_diagnosis_and_treatment_tasks\">Respondent consensus suggests a role for WGS in diagnosis and treatment tasks<\/span><\/h3>\n<p>To examine which data types were being used to support diagnosis, treatment, and surveillance tasks in the workflow, we assigned a numerical score reflecting respondent consensus around each data type-task pair (Fig. 5). We found greater consensus around the data types that participants would use in diagnosis and treatment tasks, but little consensus around the data they would use for surveillance tasks, contrasting with participants\u2019 previously stated support for using WGS or other genotyping data for understanding TB epidemiology. Overall, the most frequently used data types included administrative data (patient ID, sample type, collection site, collection date) and results from current laboratory tests (solid or liquid culture, smear status, and speciation), which together were used primarily for diagnosis and treatment. 
Prior test results from a patient were deemed important; however, the earlier expert consults indicated that such data was difficult to obtain and unlikely to be included in future reports.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig5_Crisan_PeerJ2018_6.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"48049fd9a24b75e0076c8574c562ace4\"><img alt=\"Fig5 Crisan PeerJ2018 6.jpg\" src=\"https:\/\/www.limswiki.org\/images\/6\/66\/Fig5_Crisan_PeerJ2018_6.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Fig. 5<\/b> Extent of consensus between TB workflow tasks and available TB data. Results are redundantly encoded using color and a numerical value to represent the degree of consensus between participants around using a specific data type to carry out a specific task.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>We also queried participants\u2019 perceptions of barriers impacting their workflow, with the majority of participants (83.3%) reporting issues with both the timeliness of receiving TB data from the reference laboratory and the distribution of test results across multiple documents (Table S5), a finding that corroborated the procedural insights from the expert consults.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Prototyping_via_a_design_sprint_produces_a_range_of_design_alternatives\">Prototyping via a design sprint produces a range of design alternatives<\/span><\/h3>\n<p>Equipped with an understanding of how WGS data might be used in the various TB workflow tasks, we embarked on the design stage of the design study methodology. 
A design sprint event involving study team members and information visualization experts resulted in four prototype report designs (Fig. 3) and various isolated design elements (Fig. 4). Although each prototype used different design elements for the required data types, when the prototypes were compared at the end of the event, common themes emerged. These included presenting data in an order informed by the workflow (diagnosis, then treatment, then surveillance); placing actionable, high-level data on the front page, with additional details on the second page; and using an overall summary statement at the beginning of the report together with brief summary statements at the beginning of each section.\n<\/p><p>To drill down and determine which design elements best communicate the underlying data, we isolated individual design elements (Fig. 4) and classified them as either wording choices (e.g., which heading to use for a given section of the report) or design choices, such as layout, the use of emphasis, and the use of graphics (Table S6).\n<\/p>\n<h3><span class=\"mw-headline\" id=\"The_design_choice_questionnaire_quantifies_participant_preferences_for_specific_design_elements\">The design choice questionnaire quantifies participant preferences for specific design elements<\/span><\/h3>\n<p>We next developed an online survey, the design choice questionnaire, to assess stakeholders\u2019 preferences for both specific design elements and overall report prototypes. The distribution of public health roles among survey participants is presented in Table 1; all but 11 participants (20%) actively worked with TB data. 
Participants were employed by academic institutions (35.2%), <a href=\"https:\/\/www.limswiki.org\/index.php\/Hospital\" title=\"Hospital\" target=\"_blank\" class=\"wiki-link\" data-key=\"b8f070c66d8123fe91063594befebdff\">hospitals<\/a> (24.1%), and public health organizations (33.3%), with only 7.4% of participants being employed in some other sector. The majority of participants were from the U.K. (59.2%), while 11.1% were from Canada; the remaining 29.7% were drawn from the United States (6.5%), Europe (14.8%), Brazil (2.8%), India (2.8%), and Gambia (2.8%).\n<\/p><p>We first examined participants\u2019 preference for specific wording and design elements (Figs. 6A and 6B), comparing elements arising from the prototypes to those used in the existing COMPASS-TB report, which acted as a control. Notably, of the 15 wording and design elements queried, in only two cases was the control design preferred over a design arising from one of the prototypes (note that one query did not compare to a control). Furthermore, in eight out of 15 queries (Q6, Q8, Q9, Q10, Q12, Q17, Q5, Q18) participants showed strong preferences, wherein the top preference was +3 or more standard deviations from the mean for both clinicians and non-clinicians. Figure S1 provides a version of Fig. 
6 with confidence intervals and indicates concordance between strong preferences and non-overlapping confidence intervals.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig6_Crisan_PeerJ2018_6.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"f09a522019ef2193c20b542f79c7025a\"><img alt=\"Fig6 Crisan PeerJ2018 6.jpg\" src=\"https:\/\/www.limswiki.org\/images\/2\/27\/Fig6_Crisan_PeerJ2018_6.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Fig. 6<\/b> Design Choice Questionnaire results. Responses are grouped according to question type: wording (A), design choices (B), and full reports (C), and partitioned into clinician participants (squares) and non-clinician participants (circles). Responses are colored according to whether they are the control design from the original report (white) or an alternative design devised in the design sprint (black). Lines connect options between clinician and non-clinicians preferences, with thicker crossing lines showing discordance between the two groups and vertical lines showing concordance in preferences. Rescaled rank scores are shown against a reference of random permutations (see Methods), with scores closer to 1 indicating the most preferred response. Specific questions are indicated with Q; the questions as presented to the participants are shown in Table S6.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>The findings from the analysis of wording elements (Fig. 
6A) showed that participants preferred complete terms to abbreviations, such as writing out \u201cisoniazid\u201d as opposed to \u201cINH\u201d or \u201cH\u201d, or \u201cresistant\u201d as opposed to \u201cR\u201d, and that both clinicians and non-clinicians were in agreement over the preferred vocabulary for section headings. Interestingly, wording questions related to the treatment task yielded the widest range of rankings.\n<\/p><p>Clear preferences were also observed for information design elements, again largely concordant between clinicians and non-clinicians (Fig. 6B). Participants preferred elements that drew attention to specific data such as summary statements, shading, and tick boxes, and many participants preferred that sections be prioritized, with less important details relegated to the second page of the report. However, there was less consensus around how much detail to include and where. The majority of participants indicated that genomic data pertaining to resistance-conferring mutations should be included (Fig. 6B; Q11), but were divided as to which data should be included and where. Most (85%) wanted to know the gene harboring the resistance mutation (e.g., katG or inhA), but only half wanted details of the specific mutation (50% wanted the amino acid substitution, 46% wanted to know the nucleotide-level change). We did not test any design elements displaying the strength of the association between the mutation and the resistance phenotype; however, we will add this to a future version of the report pending receipt of the final mutation catalog from the ReSeqTB Consortium.\n<\/p><p>Interestingly, while both clinicians and non-clinicians reported similar rankings for most design elements, one element showed an unusual distribution of scores: the visualization for showing genomic relatedness and membership in a cluster. 
While both groups of participants preferred a phylogenetic tree accompanied by a summary table, which is the current COMPASS-TB control design, the other four options appeared to be ranked randomly, with rescaled rank score close to 0.5, suggesting that none of the alternative options were particularly good.\n<\/p><p>We also had participants rank their preferences for the four prototype designs (Fig. 6C). While all participants ranked Prototype D as their least preferred choice, many citing that the images used were too distracting, clinicians and non-clinicians varied in their ranking of the other three options, with clinicians preferring option A and non-clinicians preferring B. However, qualitative feedback collected for this question revealed that participants found comparing individual elements easier than comparing full reports.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Qualitative_data_affords_additional_insights_into_report_design\">Qualitative data affords additional insights into report design<\/span><\/h3>\n<p>The qualitative responses in the design choice questionnaire raised important points that would otherwise not have been captured by quantitative data alone. For example, the importance of presenting drug susceptibility data clearly emerged from the qualitative responses. Participants indicated that \u201cthe report must call attention [to] drug resistance\u201d and expressed concern that the abbreviation of drug names and\/or predicted resistance phenotype could lead to misinterpretation and pose risks to patient safety, stating that \u201cnot all clinicians [are] likely to recognize the abbreviations\u201d and \u201c[using the full name] reduces the risk of errors, especially if new to TB\u201d. 
When choosing how to emphasize predicted drug susceptibility information (shading, bolding, alert glyphs, or no emphasis), some participants suggested \u201cshading draws the quickest attention to [resistance]\u201d and that \u201cwith presbyopia, resistance can be easily missed and therefore shading affords greater patient safety\u201d, but other participants indicated drug susceptibility, rather than resistance, should be emphasized: \u201cnot sure that resistant should be shaded\u2014better to shade sensitive drugs in my view\u201d and \u201cit would be better to highlight what is working instead of highlight what is not working.\u201d We opted to highlight resistance given the low incidence of drug-resistant TB in the U.K. and Canada, which were the primary application contexts. Some reported concerns as to whether such emphasis was possible with current electronic health records, including \u201c[bolding or shading] may not transfer correctly\u201d and \u201cshaded [text] won\u2019t photocopy well,\u201d which prompted us to test both printing and photocopying of the resulting report.\n<\/p><p>The issue of clinicians having little time to interact with the report, raised in both the expert consults and the task and data questionnaire, also became apparent in the qualitative responses to the design choice questionnaire, such as \u201cthe best likelihood of success will [come] from the ability to draw attention to someone scanning the document quickly.\u201d However, participants\u2019 perceptions of which design choices best promoted rapid synthesis varied. Some preferred summaries in the form of check boxes \u2014 \u201c[a] tick box is the most straightforward way to summarize it. 
Reading a summary sentence will probably take longer\u201d and \u201cthe check boxes provide an at-a-glance result\u201d \u2014 while others preferred additional commentary: \u201cinterpretation is important; but tick boxes alone lack the necessary nuance required for interpretation\u201d and that \u201ctick boxes may cause confusion when clinicians read XDR without realizing that option is not selected. Ideal to add a comment about resistance\u201d. To address this concern we added a \u201cNo drug resistance predicted\u201d option to the check-boxes (absent from the survey design options), and included shading elements to emphasize the drug susceptibility result.\n<\/p><p>The qualitative responses to Q17 (Fig. 6B) provided further insight into the uncertainty around how best to represent genomic relatedness suggestive of an epidemiological relatedness. Some participants felt that data related to surveillance tasks should not appear in a report that is also meant for clinicians, either because it was not relevant to this audience \u2014 \u201c[this data] should not appear in the report. It should only be given to field epi and researchers. Overloading the clinical report would be deteriorating\u201d and \u201cnot useful for a clinician\u201d \u2014 or because they were uncertain about its interpretation: \u201ccluster detection would be fine for those who already know what a cluster is\u201d and \u201cmy patient\u2019s isolate is 6 SNPs from someone diagnosed 3 years ago. What is the clinical action?\u201d\n<\/p><p>Of the design choices for cluster detection, several participants articulated that many of the options, including the control, \u201c[included] too much information and [were] unnecessary for routine diagnosis\/treatment\u201d. However, others felt that the options did not provide sufficient detail and offered alternatives, such as \u201cif you can combine the phylogenetic tree with some kind of graph showing temporal spread that would be perfect. 
Adding geographical data would be a really helpful bonus too\u201d. This is an area of reporting that requires further investigation and was not fully resolved in our study.\n<\/p><p>Finally, participants were candid about those design options that did not work well. For example, of the report design with many graphics (Fig. 6C, option D), participants indicated it was \u201cdistracting; looks like a set of roadworks rather than a microbiology report\u201d and that it was important to \u201ckeep it simple\u201d. Their feedback also revealed when our phrasing on the survey instruments was unclear.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Developing_a_final_report_template\">Developing a final report template<\/span><\/h3>\n<p>There are no prescriptive guidelines around integrating our quantitative data, qualitative data, and ISO 15189:2012 reporting requirements; thus, we have attempted to be as transparent and empirical as possible in justifying our final design (Fig. 7). A more thorough walkthrough is presented in Fig. S1; here we highlight selected choices. 
The final prototype is implemented in LaTeX and is available online as a template accessible at <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.cs.ubc.ca\/labs\/imager\/tr\/2017\/MicroReportDesign\/\" target=\"_blank\">http:\/\/www.cs.ubc.ca\/labs\/imager\/tr\/2017\/MicroReportDesign\/<\/a>.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig7_Crisan_PeerJ2018_6.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"28c52d07932e365e001c2379572824b0\"><img alt=\"Fig7 Crisan PeerJ2018 6.jpg\" src=\"https:\/\/www.limswiki.org\/images\/c\/c4\/Fig7_Crisan_PeerJ2018_6.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Fig. 7<\/b> Original (A) and revised reports (B). The revised report uses empirical evidence gathered through multiple stages of a human centered design process. Note that the image in the upper corner of the revised report is a placeholder for an organizational logo.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>We first incorporated ISO 15189:2012 requirements (see Fig. S1) into the final report template and then turned to the preferences expressed in the design choice questionnaire. Overall, information was structured to mirror the TB workflow: diagnosis, treatment, then surveillance. 
We chose to limit bolding to relevant information and used shading to highlight important and actionable clinical information, under the rationale that appropriate use of emphasis could facilitate an accurate and quick reading of the report, with detailed information present but de-emphasized.\n<\/p><p>In two instances, our design decisions deviated from participant preferences: we opted to use one column instead of two, and we presented detailed genomic resistance data on the first page of the report, rather than the second page. A single column was chosen as all of the information ranked as important by participants could be presented on a single page without the need to condense information into two columns. Because many of the resistotype details of the original report, such as mutation source and individual nucleotide changes (Fig. 1), were not included in the revised report, it was possible to present all of the participants\u2019 desired data in a single table on one page.\n<\/p><p>A draft of the final design was presented to a new cohort of TB stakeholders at a September 2017 expert working group on standardized reporting of TB genomic resistance data. 
Through a group discussion, subtle changes to the report were made, including updating some of the language used (for example, replacing occurrences of the word \u201csensitive\u201d with \u201csusceptible\u201d), adding the lineage to the organism section, and adding fields to the tables describing the sample and the assay, such as what type of material was sequenced (pure culture, direct specimen) and what sequencing platform was used.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Discussion\">Discussion<\/span><\/h2>\n<p>Microbial genomics is playing an increasingly important role in public health microbiology, and its successful implementation in the clinic will rely not just on validation and accreditation of WGS-based tests, but also on how effective the resulting reports are for stakeholders, including clinicians. Using design study methodology, we developed a two-page report template to communicate WGS-derived test results related to TB diagnosis, drug susceptibility testing, and clustering.\n<\/p><p>To our knowledge, this project is the first formal inquiry into human-centered design for microbial genomics reporting. We argue that the application of human-centered design methodologies allowed us to improve not only the visual aesthetics of the final report, but also its functionality, by carefully coupling stakeholder tasks, data, and constraints to techniques from information and graphic design. Giving the original report a \u201cgraphic design facelift\u201d would not have improved the functionality, as some of the information in the original report was found to be unnecessary, presented in a way that could lead to misinterpretation, or did not take into account stakeholder constraints. 
For example, interviews and surveys revealed procedural and data constraints our study team had not anticipated, including the limited time available for clinicians to read laboratory reports and the need for simple, black-and-white formatting amenable to media ranging from electronic delivery to fax; these findings were critical to shaping the downstream design process. Furthermore, in nearly every case, study participants preferred our alternative design elements, informed by empirical findings in the discovery stage, over the control elements derived from the original report. Our approach also suggested that some participants are not confident in their ability to interpret certain types of genomic data. As WGS moves towards routine clinical use, it is clear that successful implementation of genomic assays will also require complementary education and training opportunities for those individuals regularly interacting with WGS-derived data.\n<\/p><p>Although human-centered information visualization design methodologies are commonly used in software development, it could be asked whether they are warranted in a report design project. One advantage of tackling the simpler problem of report design is that it allows us to demonstrate design study methodology in action and link evidence to design decisions more clearly than with a software product. We also collected data with the intention of applying it to the development and evaluation of more complex reporting and data visualization software that we plan to create. Similarly, others can use our approach or our data to inform the design of simple or complex applications elsewhere in pathogen genomics and bioinformatics.\n<\/p><p>The exploratory nature of this project brings with it certain limitations. First, our participants were identified through convenience and snowball sampling within the authors\u2019 networks and thus are likely to be more experienced with the clinical application of microbial genomics. 
While this is appropriate for the context of our collaboration, in which our goal is redesigning a report for use by the COMPASS-TB team and collaborating laboratories, it does limit our ability to generalize the findings to other settings. WGS is only used routinely in a small number of laboratories, and even if its reach were larger, these may be settings where English is not the first language used in reporting clinical results, or where written text is read in different ways, both of which would affect our design choices. Second, we did not have <i>a priori<\/i> knowledge of the effect sizes (i.e., extent of preferential difference for each type of question) in the design choice questionnaire, making sample size calculations challenging. Had <i>a priori<\/i> effect sizes been available, the study could have been powered, for example, for the smallest or average effect size. To avoid mischaracterizing our results, we have relied primarily on descriptive statistics, without tests for statistical significance, and assert that our findings are best interpreted as first steps toward a better understanding of how information and visualization design can play a role in reporting pathogen WGS data. However, when confidence intervals were calculated for the results of the design choice questionnaire, we observed that non-overlapping confidence intervals separated user preferences as well as the deviation from a random score metric that we primarily used in our analysis. We argue the latter is a useful measure for exploratory studies without clear <i>a priori<\/i> knowledge of effect sizes for proper sample size calculations. Finally, we did not undertake a head-to-head experimental comparison between the original report design and the revised design. 
While this comparison had been planned at the outset of our project, the results of the design choice questionnaire showed such a clear preference for the alternative designs when comparing isolated components that we concluded there was no need for such a final test as it would yield little new evidence.\n<\/p><p>For researchers wishing to undertake a similar human-centered design approach, we have summarized our primary findings into three experimental guidelines and five design guidelines. These guidelines arose from our experience throughout this report redesign process but are intended to apply generally to the process of designing visualizations for microbial genomic data or other human health-related information.\n<\/p><p>The three experimental guidelines reflect the areas of the design methodology that we found to be particularly important in our data collection and analysis as well as the final report design process. First, design around tasks. It is tempting to simply ask stakeholders what they want to see in a final design, but many of them will not be able to create an effective end product because design is not their principal area of expertise. However, stakeholders know very well what they do on a daily basis and can indicate data that are relevant to those specific tasks and can indicate in which areas they require more support. The role of the designer is to marry those tasks, clinical workflows, and constraints into design alternatives. Depending on the tasks and context, many design alternatives might be possible, making use of color, more complex visualizations, or interactivity. In other situations, such as the one presented here, design constraints limit the range of prototypes that can be generated. \n<\/p><p>Second, compare isolated components, and not just whole systems. Here we use \"system\" to mean either a simple report or a more complex software system. 
Comparing whole systems can overload an individual\u2019s working memory, meaning they may rely on heuristics such as preferences around style or distracting elements, when assessing and comparing full systems.<sup id=\"rdp-ebb-cite_ref-ShahHeuristics08_26-0\" class=\"reference\"><a href=\"#cite_note-ShahHeuristics08-26\" rel=\"external_link\">[26]<\/a><\/sup> Presenting isolated design elements and controlling for non-tested factors (i.e., font, text) can reduce the burden on working memory and isolate the effect of design alternatives. \n<\/p><p>Finally, compare against a control whenever possible. If a prior report or system exists, or if there are commonly agreed upon conventions in the literature or field, it is useful to compare novel designs against an existing one. More generally, comparison of multiple alternatives is the most critical defense against defaulting to <i>ad hoc<\/i> designs and the most important step of our human-centered design methodology.\n<\/p><p>Our five design guidelines reflect techniques from information visualization and graphic design that we used in an attempt to improve the readability of the report and balance different stakeholder information needs. First, structure information such that it mimics a stakeholder\u2019s workflow. In this case, the report prioritizes a clinical workflow, and this workflow is reflected in the report\u2019s design through the use of gestalt principles<sup id=\"rdp-ebb-cite_ref-MooreUsing93_27-0\" class=\"reference\"><a href=\"#cite_note-MooreUsing93-27\" rel=\"external_link\">[27]<\/a><\/sup>, treating the whole as greater than the sum of its parts. Specifically, we group related data and order information hierarchically so that the document is read according to the clinical narrative we established in the discovery phase. Second, use emphasis carefully. Here, bolding, text size, and shading were reserved to highlight important data and were not applied to aesthetic aspects of the report design. 
Third, present dense information in a careful and structured manner. Stakeholders should not have to search for relevant information, a cognitively expensive task<sup id=\"rdp-ebb-cite_ref-ChangTheEff12_28-0\" class=\"reference\"><a href=\"#cite_note-ChangTheEff12-28\" rel=\"external_link\">[28]<\/a><\/sup> that can result in information loss.<sup id=\"rdp-ebb-cite_ref-ShneidermanTheEyes96_29-0\" class=\"reference\"><a href=\"#cite_note-ShneidermanTheEyes96-29\" rel=\"external_link\">[29]<\/a><\/sup> Through the combination of gestalt, visual hierarchy, and careful use of emphasis, it is possible to present a lot of information by creating two layers: a higher-level \u201cquick glance\u201d layer and a more detailed lower layer. The quick glance layer should contain the relevant and clinically actionable information and should be visually salient (i.e., \u201cpop-out\u201d), while the detailed layer should be less visually salient and contain additional information that some, but not all, stakeholders may wish to have (based on their tasks and data needs). Fourth, use words precisely. Specific terminology may not be uniformly understood or consistently interpreted by stakeholders, particularly when the designer and the stakeholders come from different domains, or even when individuals in the same domain have markedly different daily workflows, such as bioinformaticians and clinicians. Finally, if using images, do so judiciously. 
Images can be distracting when they do not convey actionable information relevant to the stakeholder.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Conclusions\">Conclusions<\/span><\/h2>\n<p>We applied human-centered design methodologies to redesign a clinical report for a reference microbiology laboratory, but the techniques we used \u2014 drawn from more complex applications in information visualization and human\u2013computer interaction \u2014 can be used in other scenarios, including the development of more complex data dashboards, data visualization or other bioinformatics tools. By introducing these techniques to the microbial genomics, bioinformatics, and genomic epidemiology communities, we hope to inspire their further use of evidence-based, human-centric design.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Supplemental_information\">Supplemental information<\/span><\/h2>\n<p>Supplemental materials <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/dfzljdn9uc3pi.cloudfront.net\/2018\/4218\/1\/Supplement_Text_v3.pdf\" target=\"_blank\">found here<\/a> (PDF).\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Additional_information_and_declarations\">Additional information and declarations<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Competing_interests\">Competing interests<\/span><\/h3>\n<p>The authors declare there are no competing interests.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Author_contributions\">Author contributions<\/span><\/h3>\n<p>Anamaria Crisan and Geoffrey McKee conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents\/materials\/analysis tools, wrote the paper, prepared figures and\/or tables, reviewed drafts of the paper. Tamara Munzner conceived and designed the experiments, analyzed the data, wrote the paper, reviewed drafts of the paper. Jennifer L. 
Gardy conceived and designed the experiments, performed the experiments, analyzed the data, wrote the paper, reviewed drafts of the paper.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Human_ethics\">Human ethics<\/span><\/h3>\n<p>Our research was reviewed and approved by the University of British Columbia\u2019s Behavioural Research Ethics Board (H10-03336).\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Data_aAvailability\">Data availability<\/span><\/h3>\n<p>Data + Analysis Repository: <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/github.com\/amcrisan\/TBReportRedesign\" target=\"_blank\">https:\/\/github.com\/amcrisan\/TBReportRedesign<\/a>\n<\/p><p>Report Example Repository: <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/github.com\/amcrisan\/TB-WGS-MicroReport\" target=\"_blank\">https:\/\/github.com\/amcrisan\/TB-WGS-MicroReport<\/a>\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Funding\">Funding<\/span><\/h3>\n<p>Anamaria Crisan is supported by the Vanier Scholars Program; Dr. Jennifer L. Gardy is supported by the Canada Research Chairs Program and the Michael Smith Foundation for Health Research Scholar Award Program; and Dr. Tamara Munzner is supported by the Natural Sciences and Engineering Research Council of Canada Discovery Program RGPIN-2014-06309. The work described here was funded by the British Columbia Centre for Disease Control Foundation for Population and Public Health, as well as Genome BC, through grant G01SMA: Sharing Mycobacterial Analytic Capacity. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Acknowledgements\">Acknowledgements<\/span><\/h2>\n<p>We would like to acknowledge and thank all study participants who took the time to respond to the surveys and provided excellent and valuable insights for our work. 
We would also like to thank Ana Gibertoni-Cruz, Grace Smith, and Tim Walker for their contributions in the early stages of this project and in recruiting participants, and the members of the information visualization group at the University of British Columbia who participated in our design sprint: Kimberly Dextras-Romagnino, Dylan Dong, Georges Hattab, and Zipeng Liu.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-FukuiMeta15-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-FukuiMeta15_1-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Fukui, Y.; Aoki, K.; Okuma, S. et al.. \"Metagenomic analysis for detecting pathogens in culture-negative infective endocarditis\". <i>Journal of Infection and Chemotherapy<\/i> <b>21<\/b> (12): 882\u20134. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.jiac.2015.08.007\" target=\"_blank\">10.1016\/j.jiac.2015.08.007<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26360016\" target=\"_blank\">26360016<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Metagenomic+analysis+for+detecting+pathogens+in+culture-negative+infective+endocarditis&rft.jtitle=Journal+of+Infection+and+Chemotherapy&rft.aulast=Fukui%2C+Y.%3B+Aoki%2C+K.%3B+Okuma%2C+S.+et+al.&rft.au=Fukui%2C+Y.%3B+Aoki%2C+K.%3B+Okuma%2C+S.+et+al.&rft.volume=21&rft.issue=12&rft.pages=882%E2%80%934&rft_id=info:doi\/10.1016%2Fj.jiac.2015.08.007&rft_id=info:pmid\/26360016&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LomanACulture13-2\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LomanACulture13_2-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Loman, N.J.; Constantinidou, C.; Christner, M. et al.. \"A culture-independent sequence-based metagenomics approach to the investigation of an outbreak of Shiga-toxigenic Escherichia coli O104:H4\". <i>JAMA<\/i> <b>309<\/b> (14): 1502-10. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1001%2Fjama.2013.3231\" target=\"_blank\">10.1001\/jama.2013.3231<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23571589\" target=\"_blank\">23571589<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+culture-independent+sequence-based+metagenomics+approach+to+the+investigation+of+an+outbreak+of+Shiga-toxigenic+Escherichia+coli+O104%3AH4&rft.jtitle=JAMA&rft.aulast=Loman%2C+N.J.%3B+Constantinidou%2C+C.%3B+Christner%2C+M.+et+al.&rft.au=Loman%2C+N.J.%3B+Constantinidou%2C+C.%3B+Christner%2C+M.+et+al.&rft.volume=309&rft.issue=14&rft.pages=1502-10&rft_id=info:doi\/10.1001%2Fjama.2013.3231&rft_id=info:pmid\/23571589&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BradleyRapid15-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BradleyRapid15_3-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Bradley, P.; Gordon, N.C.; Walker, T.M. et al.. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4703848\" target=\"_blank\">\"Rapid antibiotic-resistance predictions from genome sequence data for Staphylococcus aureus and Mycobacterium tuberculosis\"<\/a>. <i>Nature Communications<\/i> <b>6<\/b>: 10063. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fncomms10063\" target=\"_blank\">10.1038\/ncomms10063<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4703848\/\" target=\"_blank\">PMC4703848<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26686880\" target=\"_blank\">26686880<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4703848\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4703848<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Rapid+antibiotic-resistance+predictions+from+genome+sequence+data+for+Staphylococcus+aureus+and+Mycobacterium+tuberculosis&rft.jtitle=Nature+Communications&rft.aulast=Bradley%2C+P.%3B+Gordon%2C+N.C.%3B+Walker%2C+T.M.+et+al.&rft.au=Bradley%2C+P.%3B+Gordon%2C+N.C.%3B+Walker%2C+T.M.+et+al.&rft.volume=6&rft.pages=10063&rft_id=info:doi\/10.1038%2Fncomms10063&rft_id=info:pmc\/PMC4703848&rft_id=info:pmid\/26686880&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4703848&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PankhurstRapid16-4\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-PankhurstRapid16_4-0\" rel=\"external_link\">4.0<\/a><\/sup> <sup><a href=\"#cite_ref-PankhurstRapid16_4-1\" rel=\"external_link\">4.1<\/a><\/sup><\/span> 
<span class=\"reference-text\"><span class=\"citation Journal\">Pankhurst, L.J.; Del Ojo Elias, C.; Votintseva, A.A. et al.. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4698465\" target=\"_blank\">\"Rapid, comprehensive, and affordable mycobacterial diagnosis with whole-genome sequencing: A prospective study\"<\/a>. <i>The Lancet Respiratory Medicine<\/i> <b>4<\/b> (1): 49-58. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2FS2213-2600%2815%2900466-X\" target=\"_blank\">10.1016\/S2213-2600(15)00466-X<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4698465\/\" target=\"_blank\">PMC4698465<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26669893\" target=\"_blank\">26669893<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4698465\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4698465<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Rapid%2C+comprehensive%2C+and+affordable+mycobacterial+diagnosis+with+whole-genome+sequencing%3A+A+prospective+study&rft.jtitle=The+Lancet+Respiratory+Medicine&rft.aulast=Pankhurst%2C+L.J.%3B+Del+Ojo+Elias%2C+C.%3B+Votintseva%2C+A.A.+et+al.&rft.au=Pankhurst%2C+L.J.%3B+Del+Ojo+Elias%2C+C.%3B+Votintseva%2C+A.A.+et+al.&rft.volume=4&rft.issue=1&rft.pages=49-58&rft_id=info:doi\/10.1016%2FS2213-2600%2815%2900466-X&rft_id=info:pmc\/PMC4698465&rft_id=info:pmid\/26669893&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4698465&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WalkerWhole15-5\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WalkerWhole15_5-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Walker, T.M.; Kohl, T.A.; Omar, S.V. et al.. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4579482\" target=\"_blank\">\"Whole-genome sequencing for prediction of Mycobacterium tuberculosis drug susceptibility and resistance: A retrospective cohort study\"<\/a>. <i>The Lancet Infectious Diseases<\/i> <b>15<\/b> (10): 1193\u20131202. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2FS1473-3099%2815%2900062-6\" target=\"_blank\">10.1016\/S1473-3099(15)00062-6<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4579482\/\" target=\"_blank\">PMC4579482<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26116186\" target=\"_blank\">26116186<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4579482\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4579482<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Whole-genome+sequencing+for+prediction+of+Mycobacterium+tuberculosis+drug+susceptibility+and+resistance%3A+A+retrospective+cohort+study&rft.jtitle=The+Lancet+Infectious+Diseases&rft.aulast=Walker%2C+T.M.%3B+Kohl%2C+T.A.%3B+Omar%2C+S.V.+et+al.&rft.au=Walker%2C+T.M.%3B+Kohl%2C+T.A.%3B+Omar%2C+S.V.+et+al.&rft.volume=15&rft.issue=10&rft.pages=1193%E2%80%931202&rft_id=info:doi\/10.1016%2FS1473-3099%2815%2900062-6&rft_id=info:pmc\/PMC4579482&rft_id=info:pmid\/26116186&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4579482&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NikolayevskyyWhole16-6\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-NikolayevskyyWhole16_6-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Nikolayevskyy, V.; Kranzer, K.; Niemann, S. et al.. \"Whole genome sequencing of Mycobacterium tuberculosis for detection of recent transmission and tracing outbreaks: A systematic review\". <i>Tuberculosis<\/i> <b>98<\/b>: 77-85. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.tube.2016.02.009\" target=\"_blank\">10.1016\/j.tube.2016.02.009<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/27156621\" target=\"_blank\">27156621<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Whole+genome+sequencing+of+Mycobacterium+tuberculosis+for+detection+of+recent+transmission+and+tracing+outbreaks%3A+A+systematic+review&rft.jtitle=Tuberculosis&rft.aulast=Nikolayevskyy%2C+V.%3B+Kranzer%2C+K.%3B+Niemann%2C+S.+et+al.&rft.au=Nikolayevskyy%2C+V.%3B+Kranzer%2C+K.%3B+Niemann%2C+S.+et+al.&rft.volume=98&rft.pages=77-85&rft_id=info:doi\/10.1016%2Fj.tube.2016.02.009&rft_id=info:pmid\/27156621&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BudowleValid14-7\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BudowleValid14_7-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Budowle, B.; Connell, N.D.; Bielecka-Oder, A. et al.. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4123828\" target=\"_blank\">\"Validation of high throughput sequencing and microbial forensics applications\"<\/a>. <i>Investigative Genetics<\/i> <b>5<\/b>: 9. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2F2041-2223-5-9\" target=\"_blank\">10.1186\/2041-2223-5-9<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4123828\/\" target=\"_blank\">PMC4123828<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25101166\" target=\"_blank\">25101166<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4123828\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4123828<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Validation+of+high+throughput+sequencing+and+microbial+forensics+applications&rft.jtitle=Investigative+Genetics&rft.aulast=Budowle%2C+B.%3B+Connell%2C+N.D.%3B+Bielecka-Oder%2C+A.+et+al.&rft.au=Budowle%2C+B.%3B+Connell%2C+N.D.%3B+Bielecka-Oder%2C+A.+et+al.&rft.volume=5&rft.pages=9&rft_id=info:doi\/10.1186%2F2041-2223-5-9&rft_id=info:pmc\/PMC4123828&rft_id=info:pmid\/25101166&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4123828&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GargisAssur16-8\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GargisAssur16_8-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Gargis, A.S.; Kalman, L.; Lubin, I.M.. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5121372\" target=\"_blank\">\"Assuring the Quality of Next-Generation Sequencing in Clinical Microbiology and Public Health Laboratories\"<\/a>. <i>Journal of Clinical Microbiology<\/i> <b>54<\/b> (12): 2857-2865. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1128%2FJCM.00949-16\" target=\"_blank\">10.1128\/JCM.00949-16<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5121372\/\" target=\"_blank\">PMC5121372<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/27510831\" target=\"_blank\">27510831<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5121372\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5121372<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Assuring+the+Quality+of+Next-Generation+Sequencing+in+Clinical+Microbiology+and+Public+Health+Laboratories&rft.jtitle=Journal+of+Clinical+Microbiology&rft.aulast=Gargis%2C+A.S.%3B+Kalman%2C+L.%3B+Lubin%2C+I.M.&rft.au=Gargis%2C+A.S.%3B+Kalman%2C+L.%3B+Lubin%2C+I.M.&rft.volume=54&rft.issue=12&rft.pages=2857-2865&rft_id=info:doi\/10.1128%2FJCM.00949-16&rft_id=info:pmc\/PMC5121372&rft_id=info:pmid\/27510831&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5121372&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DriscollInteg11-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-DriscollInteg11_9-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Driscoll, T.; Gabbard, J.L.; Mao, C. et al.. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3150046\" target=\"_blank\">\"Integration and visualization of host-pathogen data related to infectious diseases\"<\/a>. <i>Bioinformatics<\/i> <b>27<\/b> (16): 2279-87. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fbioinformatics%2Fbtr391\" target=\"_blank\">10.1093\/bioinformatics\/btr391<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3150046\/\" target=\"_blank\">PMC3150046<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/21712250\" target=\"_blank\">21712250<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3150046\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3150046<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Integration+and+visualization+of+host-pathogen+data+related+to+infectious+diseases&rft.jtitle=Bioinformatics&rft.aulast=Driscoll%2C+T.%3B+Gabbard%2C+J.L.%3B+Mao%2C+C.+et+al.&rft.au=Driscoll%2C+T.%3B+Gabbard%2C+J.L.%3B+Mao%2C+C.+et+al.&rft.volume=27&rft.issue=16&rft.pages=2279-87&rft_id=info:doi\/10.1093%2Fbioinformatics%2Fbtr391&rft_id=info:pmc\/PMC3150046&rft_id=info:pmid\/21712250&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3150046&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> 
<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Argim.C3.B3nMicro16-10\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Argim.C3.B3nMicro16_10-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Argim\u00f3n, S.; Abudahab, K.; Goater, R.J. et al.. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5320705\" target=\"_blank\">\"Microreact: Visualizing and sharing data for genomic epidemiology and phylogeography\"<\/a>. <i>Microbial Genomics<\/i> <b>2<\/b> (11): e000093. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1099%2Fmgen.0.000093\" target=\"_blank\">10.1099\/mgen.0.000093<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5320705\/\" target=\"_blank\">PMC5320705<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/28348833\" target=\"_blank\">28348833<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5320705\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5320705<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Microreact%3A+Visualizing+and+sharing+data+for+genomic+epidemiology+and+phylogeography&rft.jtitle=Microbial+Genomics&rft.aulast=Argim%C3%B3n%2C+S.%3B+Abudahab%2C+K.%3B+Goater%2C+R.J.+et+al.&rft.au=Argim%C3%B3n%2C+S.%3B+Abudahab%2C+K.%3B+Goater%2C+R.J.+et+al.&rft.volume=2&rft.issue=11&rft.pages=e000093&rft_id=info:doi\/10.1099%2Fmgen.0.000093&rft_id=info:pmc\/PMC5320705&rft_id=info:pmid\/28348833&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5320705&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CarrollVisual14-11\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CarrollVisual14_11-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Carroll, L.N.; Au, A.P.; Detwiler, L.T. et al.. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5734643\" target=\"_blank\">\"Visualization and analytics tools for infectious disease epidemiology: A systematic review\"<\/a>. <i>Journal of Biomedical Informatics<\/i> <b>51<\/b>: 287-98. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.jbi.2014.04.006\" target=\"_blank\">10.1016\/j.jbi.2014.04.006<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5734643\/\" target=\"_blank\">PMC5734643<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/24747356\" target=\"_blank\">24747356<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5734643\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5734643<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Visualization+and+analytics+tools+for+infectious+disease+epidemiology%3A+A+systematic+review&rft.jtitle=Journal+of+Biomedical+Informatics&rft.aulast=Carroll%2C+L.N.%3B+Au%2C+A.P.%3B+Detwiler%2C+L.T.+et+al.&rft.au=Carroll%2C+L.N.%3B+Au%2C+A.P.%3B+Detwiler%2C+L.T.+et+al.&rft.volume=51&rft.pages=287-98&rft_id=info:doi\/10.1016%2Fj.jbi.2014.04.006&rft_id=info:pmc\/PMC5734643&rft_id=info:pmid\/24747356&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5734643&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> 
<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LeslieStandard94-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LeslieStandard94_12-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Leslie, K.O.; Rosai, J.. \"Standardization of the surgical pathology report: Formats, templates, and synoptic reports\". <i>Seminars in Diagnostic Pathology<\/i> <b>11<\/b> (4): 253-7. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/7878300\" target=\"_blank\">7878300<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Standardization+of+the+surgical+pathology+report%3A+Formats%2C+templates%2C+and+synoptic+reports&rft.jtitle=Seminars+in+Diagnostic+Pathology&rft.aulast=Leslie%2C+K.O.%3B+Rosai%2C+J.&rft.au=Leslie%2C+K.O.%3B+Rosai%2C+J.&rft.volume=11&rft.issue=4&rft.pages=253-7&rft_id=info:pmid\/7878300&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ValensteinFormat08-13\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ValensteinFormat08_13-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Valenstein, P.N.. \"Formatting pathology reports: Applying four design principles to improve communication and patient safety\". <i>Archives of Pathology and Laboratory Medicine<\/i> <b>132<\/b> (1): 84-94. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1043%2F1543-2165%282008%29132%5B84%3AFPRAFD%5D2.0.CO%3B2\" target=\"_blank\">10.1043\/1543-2165(2008)132[84:FPRAFD]2.0.CO;2<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/18181680\" target=\"_blank\">18181680<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Formatting+pathology+reports%3A+Applying+four+design+principles+to+improve+communication+and+patient+safety&rft.jtitle=Archives+of+Pathology+and+Laboratory+Medicine&rft.aulast=Valenstein%2C+P.N.&rft.au=Valenstein%2C+P.N.&rft.volume=132&rft.issue=1&rft.pages=84-94&rft_id=info:doi\/10.1043%2F1543-2165%282008%29132%5B84%3AFPRAFD%5D2.0.CO%3B2&rft_id=info:pmid\/18181680&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-RenshawTheImpact14-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-RenshawTheImpact14_14-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Renshaw, S.A.; Mena-Allauca, M.; Touriz, M. et al.. \"The impact of template format on the completeness of surgical pathology reports\". <i>Archives of Pathology and Laboratory Medicine<\/i> <b>138<\/b> (1): 121-4. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.5858%2Farpa.2012-0733-OA\" target=\"_blank\">10.5858\/arpa.2012-0733-OA<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/24377820\" target=\"_blank\">24377820<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+impact+of+template+format+on+the+completeness+of+surgical+pathology+reports&rft.jtitle=Archives+of+Pathology+and+Laboratory+Medicine&rft.aulast=Renshaw%2C+S.A.%3B+Mena-Allauca%2C+M.%3B+Touriz%2C+M.+et+al.&rft.au=Renshaw%2C+S.A.%3B+Mena-Allauca%2C+M.%3B+Touriz%2C+M.+et+al.&rft.volume=138&rft.issue=1&rft.pages=121-4&rft_id=info:doi\/10.5858%2Farpa.2012-0733-OA&rft_id=info:pmid\/24377820&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NygrenHelping98-15\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-NygrenHelping98_15-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Nygren, E.; Wyatt, J.C.; Wright, P.. \"Helping clinicians to find data and avoid delays\". <i>Lancet<\/i> <b>352<\/b> (9138): 1462-6. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2FS0140-6736%2897%2908307-4\" target=\"_blank\">10.1016\/S0140-6736(97)08307-4<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/9808009\" target=\"_blank\">9808009<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Helping+clinicians+to+find+data+and+avoid+delays&rft.jtitle=Lancet&rft.aulast=Nygren%2C+E.%3B+Wyatt%2C+J.C.%3B+Wright%2C+P.&rft.au=Nygren%2C+E.%3B+Wyatt%2C+J.C.%3B+Wright%2C+P.&rft.volume=352&rft.issue=9138&rft.pages=1462-6&rft_id=info:doi\/10.1016%2FS0140-6736%2897%2908307-4&rft_id=info:pmid\/9808009&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WrightHow98-16\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-WrightHow98_16-0\" rel=\"external_link\">16.0<\/a><\/sup> <sup><a href=\"#cite_ref-WrightHow98_16-1\" rel=\"external_link\">16.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Wright, P.; Jansen, C.; Wyatt, J.C.. \"How to limit clinical errors in interpretation of data\". <i>Lancet<\/i> <b>352<\/b> (9139): 1539-43. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2FS0140-6736%2898%2908308-1\" target=\"_blank\">10.1016\/S0140-6736(98)08308-1<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/9820319\" target=\"_blank\">9820319<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=How+to+limit+clinical+errors+in+interpretation+of+data&rft.jtitle=Lancet&rft.aulast=Wright%2C+P.%3B+Jansen%2C+C.%3B+Wyatt%2C+J.C.&rft.au=Wright%2C+P.%3B+Jansen%2C+C.%3B+Wyatt%2C+J.C.&rft.volume=352&rft.issue=9139&rft.pages=1539-43&rft_id=info:doi\/10.1016%2FS0140-6736%2898%2908308-1&rft_id=info:pmid\/9820319&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ZipkinEvidence14-17\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ZipkinEvidence14_17-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Zipkin, D.A.; Umscheid, C.A.; Keating, N.L. et al.. \"Evidence-based risk communication: A systematic review\". <i>Annals of Internal Medicine<\/i> <b>161<\/b> (4): 270-80. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.7326%2FM14-0295\" target=\"_blank\">10.7326\/M14-0295<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25133362\" target=\"_blank\">25133362<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Evidence-based+risk+communication%3A+A+systematic+review&rft.jtitle=Annals+of+Internal+Medicine&rft.aulast=Zipkin%2C+D.A.%3B+Umscheid%2C+C.A.%3B+Keating%2C+N.L.+et+al.&rft.au=Zipkin%2C+D.A.%3B+Umscheid%2C+C.A.%3B+Keating%2C+N.L.+et+al.&rft.volume=161&rft.issue=4&rft.pages=270-80&rft_id=info:doi\/10.7326%2FM14-0295&rft_id=info:pmid\/25133362&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AnckerDesign06-18\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AnckerDesign06_18-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Ancker, J.S.; Senathirajah, Y.; Kukafka, R. et al.. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC1656964\" target=\"_blank\">\"Design features of graphs in health risk communication: A systematic review\"<\/a>. <i>JAMIA<\/i> <b>13<\/b> (6): 608\u201318. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1197%2Fjamia.M2115\" target=\"_blank\">10.1197\/jamia.M2115<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC1656964\/\" target=\"_blank\">PMC1656964<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/16929039\" target=\"_blank\">16929039<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC1656964\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC1656964<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Design+features+of+graphs+in+health+risk+communication%3A+A+systematic+review&rft.jtitle=JAMIA&rft.aulast=Ancker%2C+J.S.%3B+Senathirajah%2C+Y.%3B+Kukafka%2C+R.+et+al.&rft.au=Ancker%2C+J.S.%3B+Senathirajah%2C+Y.%3B+Kukafka%2C+R.+et+al.&rft.volume=13&rft.issue=6&rft.pages=608%E2%80%9318&rft_id=info:doi\/10.1197%2Fjamia.M2115&rft_id=info:pmc\/PMC1656964&rft_id=info:pmid\/16929039&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC1656964&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HettingerCog17-19\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HettingerCog17_19-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Hettinger, A.Z.; Roth, E.M.; Bisantz, A.M.. 
\"Cognitive engineering and health informatics: Applications and intersections\". <i>Journal of Biomedical Informatics<\/i> <b>67<\/b>: 21\u201333. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.jbi.2017.01.010\" target=\"_blank\">10.1016\/j.jbi.2017.01.010<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/28126605\" target=\"_blank\">28126605<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Cognitive+engineering+and+health+informatics%3A+Applications+and+intersections&rft.jtitle=Journal+of+Biomedical+Informatics&rft.aulast=Hettinger%2C+A.Z.%3B+Roth%2C+E.M.%3B+Bisantz%2C+A.M.&rft.au=Hettinger%2C+A.Z.%3B+Roth%2C+E.M.%3B+Bisantz%2C+A.M.&rft.volume=67&rft.pages=21%E2%80%9333&rft_id=info:doi\/10.1016%2Fj.jbi.2017.01.010&rft_id=info:pmid\/28126605&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HorskyInterface12-20\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HorskyInterface12_20-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Horsky, J.; Schiff, G.D.; Johnston, D. et al.. \"Interface design principles for usable decision support: A targeted review of best practices for clinical prescribing interventions\". <i>Journal of Biomedical Informatics<\/i> <b>45<\/b> (6): 1202-16. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.jbi.2012.09.002\" target=\"_blank\">10.1016\/j.jbi.2012.09.002<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/22995208\" target=\"_blank\">22995208<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Interface+design+principles+for+usable+decision+support%3A+A+targeted+review+of+best+practices+for+clinical+prescribing+interventions&rft.jtitle=Journal+of+Biomedical+Informatics&rft.aulast=Horsky%2C+J.%3B+Schiff%2C+G.D.%3B+Johnston%2C+D.+et+al.&rft.au=Horsky%2C+J.%3B+Schiff%2C+G.D.%3B+Johnston%2C+D.+et+al.&rft.volume=45&rft.issue=6&rft.pages=1202-16&rft_id=info:doi\/10.1016%2Fj.jbi.2012.09.002&rft_id=info:pmid\/22995208&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PHETuber16-21\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PHETuber16_21-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.gov.uk\/government\/uploads\/system\/uploads\/attachment_data\/file\/654294\/TB_Annual_Report_2016_GTW2309_errata_v1.2.pdf\" target=\"_blank\">\"Tuberculosis in England: 2016 Report\"<\/a> (PDF). Public Health England. September 2016<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.gov.uk\/government\/uploads\/system\/uploads\/attachment_data\/file\/654294\/TB_Annual_Report_2016_GTW2309_errata_v1.2.pdf\" target=\"_blank\">https:\/\/www.gov.uk\/government\/uploads\/system\/uploads\/attachment_data\/file\/654294\/TB_Annual_Report_2016_GTW2309_errata_v1.2.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Tuberculosis+in+England%3A+2016+Report&rft.atitle=&rft.date=September+2016&rft.pub=Public+Health+England&rft_id=https%3A%2F%2Fwww.gov.uk%2Fgovernment%2Fuploads%2Fsystem%2Fuploads%2Fattachment_data%2Ffile%2F654294%2FTB_Annual_Report_2016_GTW2309_errata_v1.2.pdf&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SedlmairDesign12-22\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-SedlmairDesign12_22-0\" rel=\"external_link\">22.0<\/a><\/sup> <sup><a href=\"#cite_ref-SedlmairDesign12_22-1\" rel=\"external_link\">22.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Sedlmair, M.; Meyer, M.; Munzner, T. et al.. \"Design Study Methodology: Reflections from the Trenches and the Stacks\". <i>IEEE Transactions on Visualization and Computer Graphics<\/i> <b>18<\/b> (12): 2431-40. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FTVCG.2012.213\" target=\"_blank\">10.1109\/TVCG.2012.213<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26357151\" target=\"_blank\">26357151<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Design+Study+Methodology%3A+Reflections+from+the+Trenches+and+the+Stacks&rft.jtitle=IEEE+Transactions+on+Visualization+and+Computer+Graphics&rft.aulast=Sedlmair%2C+M.%3B+Meyer%2C+M.%3B+Munzner%2C+T.+et+al.&rft.au=Sedlmair%2C+M.%3B+Meyer%2C+M.%3B+Munzner%2C+T.+et+al.&rft.volume=18&rft.issue=12&rft.pages=2431-40&rft_id=info:doi\/10.1109%2FTVCG.2012.213&rft_id=info:pmid\/26357151&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CreswellResearch14-23\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-CreswellResearch14_23-0\" rel=\"external_link\">23.0<\/a><\/sup> <sup><a href=\"#cite_ref-CreswellResearch14_23-1\" rel=\"external_link\">23.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation book\">Creswell, J.W. (2014). <i>Research Design: Qualitative, Quantitative and Mixed Methods Approaches<\/i> (4th ed.). Sage Publications. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9781452226101.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=book&rft.btitle=Research+Design%3A+Qualitative%2C+Quantitative+and+Mixed+Methods+Approaches&rft.aulast=Creswell%2C+J.W.&rft.au=Creswell%2C+J.W.&rft.date=2014&rft.edition=4th&rft.pub=Sage+Publications&rft.isbn=9781452226101&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LloydHuman11-24\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LloydHuman11_24-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Lloyd, D.; Dykes, J.. \"Human-centered approaches in geovisualization design: Investigating multiple methods through a long-term case study\". <i>IEEE Transactions on Visualization and Computer Graphics<\/i> <b>17<\/b> (12): 2498-507. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FTVCG.2011.209\" target=\"_blank\">10.1109\/TVCG.2011.209<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/22034371\" target=\"_blank\">22034371<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Human-centered+approaches+in+geovisualization+design%3A+Investigating+multiple+methods+through+a+long-term+case+study&rft.jtitle=IEEE+Transactions+on+Visualization+and+Computer+Graphics&rft.aulast=Lloyd%2C+D.%3B+Dykes%2C+J.&rft.au=Lloyd%2C+D.%3B+Dykes%2C+J.&rft.volume=17&rft.issue=12&rft.pages=2498-507&rft_id=info:doi\/10.1109%2FTVCG.2011.209&rft_id=info:pmid\/22034371&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-VradenburgASurv02-25\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-VradenburgASurv02_25-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Vradenburg, K.; Mao, J.-Y.; Smith, P.W.; Carey, T.. \"A survey of user-centered design practice\". <i>Proceedings of the SIGCHI Conference on Human Factors in Computing Systems<\/i> <b>2002<\/b>: 471-478. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1145%2F503376.503460\" target=\"_blank\">10.1145\/503376.503460<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+survey+of+user-centered+design+practice&rft.jtitle=Proceedings+of+the+SIGCHI+Conference+on+Human+Factors+in+Computing+Systems&rft.aulast=Vradenburg%2C+K.%3B+Mao%2C+J.-Y.%3B+Smith%2C+P.W.%3B+Carey%2C+T.&rft.au=Vradenburg%2C+K.%3B+Mao%2C+J.-Y.%3B+Smith%2C+P.W.%3B+Carey%2C+T.&rft.volume=2002&rft.pages=471-478&rft_id=info:doi\/10.1145%2F503376.503460&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ShahHeuristics08-26\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ShahHeuristics08_26-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Shah, A.K.; Oppenheimer, D.M. (2008). \"Heuristics made easy: An effort-reduction framework\". <i>Psychological Bulletin<\/i> <b>134<\/b> (2): 207\u201322. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1037%2F0033-2909.134.2.207\" target=\"_blank\">10.1037\/0033-2909.134.2.207<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/18298269\" target=\"_blank\">18298269<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Heuristics+made+easy%3A+An+effort-reduction+framework&rft.jtitle=Psychological+Bulletin&rft.aulast=Shah%2C+A.K.%3B+Oppenheimer%2C+D.M.&rft.au=Shah%2C+A.K.%3B+Oppenheimer%2C+D.M.&rft.date=2008&rft.volume=134&rft.issue=2&rft.pages=207%E2%80%9322&rft_id=info:doi\/10.1037%2F0033-2909.134.2.207&rft_id=info:pmid\/18298269&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MooreUsing93-27\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MooreUsing93_27-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Moore, P.; Fitz, C.. \"Using Gestalt theory to teach document design and graphics\". <i>Technical Communication Quarterly<\/i> <b>2<\/b> (4): 389\u2013410. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1080%2F10572259309364549\" target=\"_blank\">10.1080\/10572259309364549<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Using+Gestalt+theory+to+teach+document+design+and+graphics&rft.jtitle=Technical+Communication+Quarterly&rft.aulast=Moore%2C+P.%3B+Fitz%2C+C.&rft.au=Moore%2C+P.%3B+Fitz%2C+C.&rft.volume=2&rft.issue=4&rft.pages=389%E2%80%93410&rft_id=info:doi\/10.1080%2F10572259309364549&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ChangTheEff12-28\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ChangTheEff12_28-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Chang, T.-W.; Kinshuk; Chen, N.-S.; Yu, P.-T. (2012). \"The effects of presentation method and information density on visual search ability and working memory load\". <i>Computers and Education<\/i> <b>58<\/b> (2): 721-731. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.compedu.2011.09.022\" target=\"_blank\">10.1016\/j.compedu.2011.09.022<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+effects+of+presentation+method+and+information+density+on+visual+search+ability+and+working+memory+load&rft.jtitle=Computers+and+Education&rft.aulast=Chang%2C+T.-W.%3B+Kinshuk%3B+Chen%2C+N.-S.%3B+Yu%2C+P.-T.&rft.au=Chang%2C+T.-W.%3B+Kinshuk%3B+Chen%2C+N.-S.%3B+Yu%2C+P.-T.&rft.date=2012&rft.volume=58&rft.issue=2&rft.pages=721-731&rft_id=info:doi\/10.1016%2Fj.compedu.2011.09.022&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ShneidermanTheEyes96-29\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ShneidermanTheEyes96_29-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Shneiderman, B. (1996). \"The eyes have it: A task by data type taxonomy for information visualizations\". <i>Proceedings of the IEEE Symposium on Visual Languages, 1996<\/i> <b>1996<\/b>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FVL.1996.545307\" target=\"_blank\">10.1109\/VL.1996.545307<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+eyes+have+it%3A+A+task+by+data+type+taxonomy+for+information+visualizations&rft.jtitle=Proceedings+of+the+IEEE+Symposium+on+Visual+Languages%2C+1996&rft.aulast=Shneiderman%2C+B.&rft.au=Shneiderman%2C+B.&rft.date=1996&rft.volume=1996&rft_id=info:doi\/10.1109%2FVL.1996.545307&rfr_id=info:sid\/en.wikipedia.org:Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to presentation. In several cases the PubMed ID was missing and was added to make the reference more useful. 
The original article lists references alphabetically, but this version \u2014 by design \u2014 lists them in order of appearance.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory\">https:\/\/www.limswiki.org\/index.php\/Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div 
class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","428fb6eb50c74d741daa88c4061eeab2_images":["https:\/\/www.limswiki.org\/images\/b\/bf\/Fig1_Crisan_PeerJ2018_6.jpg","https:\/\/www.limswiki.org\/images\/1\/1c\/Fig2_Crisan_PeerJ2018_6.jpg","https:\/\/www.limswiki.org\/images\/b\/bb\/Fig3_Crisan_PeerJ2018_6.jpg","https:\/\/www.limswiki.org\/images\/0\/0a\/Fig4_Crisan_PeerJ2018_6.jpg","https:\/\/www.limswiki.org\/images\/6\/66\/Fig5_Crisan_PeerJ2018_6.jpg","https:\/\/www.limswiki.org\/images\/2\/27\/Fig6_Crisan_PeerJ2018_6.jpg","https:\/\/www.limswiki.org\/images\/c\/c4\/Fig7_Crisan_PeerJ2018_6.jpg"],"428fb6eb50c74d741daa88c4061eeab2_timestamp":1544813855,"8159b0ee46c6326792ce28d0e7506e33_type":"article","8159b0ee46c6326792ce28d0e7506e33_title":"Characterizing and managing missing structured data in electronic health records: Data analysis (Beaulieu-Jones et al. 2018)","8159b0ee46c6326792ce28d0e7506e33_url":"https:\/\/www.limswiki.org\/index.php\/Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis","8159b0ee46c6326792ce28d0e7506e33_plaintext":
"\n\nJournal:Characterizing and managing missing structured data in electronic health records: Data analysis\n\nFrom LIMSWiki\n\nFull article title: Characterizing and managing missing structured data in electronic health records: Data analysis\nJournal: JMIR Medical Informatics\nAuthor(s): Beaulieu-Jones, Brett K.; Lavage, Daniel R.; Snyder, John W.; Moore, Jason H.; Pendergrass, Sarah A.; Bauer, Christopher R.\nAuthor affiliation(s): University of Pennsylvania, Geisinger\nPrimary contact: Email: cbauer at geisinger dot edu\nEditors: Eysenbach, G.\nYear published: 2018\nVolume and issue: 6 (1)\nPage(s): e11\nDOI: 10.2196\/medinform.8960\nISSN: 2291-9694\nDistribution license: Creative Commons Attribution 4.0 International\nWebsite: http:\/\/medinform.jmir.org\/2018\/1\/e11\/\nDownload: http:\/\/medinform.jmir.org\/2018\/1\/e11\/pdf (PDF)\n\nContents\n\n1 Abstract \n2 Introduction \n\n2.1 Justification \n2.2 Background \n2.3 Objective \n\n\n3 Methods \n\n3.1 Source code \n3.2 Electronic health record data processing \n3.3 Variable selection \n3.4 Predicting the presence of data \n3.5 Sampling of complete cases \n3.6 Simulation of missing data \n3.7 Imputation of missing data \n\n\n4 Results \n5 Discussion \n\n5.1 Principal results \n\n\n6 Conclusions \n7 Acknowledgements \n\n7.1 Authors' contributions \n\n\n8 Conflicts of interest \n9 Abbreviations \n10 Additional files \n11 References \n12 Notes \n\n\n\nAbstract \nBackground: Missing data is a challenge for all studies; however, this is especially true for electronic health record (EHR)-based analyses. Failure to appropriately consider missing data can lead to biased results. While there has been extensive theoretical work on imputation, and many sophisticated methods are now available, it remains quite challenging for researchers to implement these methods appropriately. Here, we provide detailed procedures for when and how to conduct imputation of EHR laboratory results.\nObjective: The objective of this study was to demonstrate how the mechanism of \"missingness\" can be assessed, evaluate the performance of a variety of imputation methods, and describe some of the most frequent problems that can be encountered.\nMethods: We analyzed clinical laboratory measures from 602,366 patients in the EHR of Geisinger Health System in Pennsylvania, USA. 
Using these data, we constructed a representative set of complete cases and assessed the performance of 12 different imputation methods for missing data that was simulated based on four mechanisms of missingness (missing completely at random, missing not at random, missing at random, and real data modelling).\nResults: Our results showed that several methods, including variations of multivariate imputation by chained equations (MICE) and softImpute, consistently imputed missing values with low error; however, only a subset of the MICE methods was suitable for multiple imputation.\nConclusions: The analyses we describe provide an outline of considerations for dealing with missing EHR data, steps that researchers can perform to characterize missingness within their own data, and an evaluation of methods that can be applied to impute clinical data. While the performance of methods may vary between datasets, the process we describe can be generalized to the majority of structured data types that exist in EHRs, and all of our methods and code are publicly available.\nKeywords: imputation, missing data, clinical laboratory test results, electronic health records \n\nIntroduction \nJustification \nMissing data present a challenge to researchers in many fields, and this challenge is growing as datasets increase in size and scope. This is especially problematic for electronic health records (EHRs), where missing values frequently outnumber observed values. EHRs were designed to record and improve patient care and streamline billing rather than act as resources for research[1]; thus, there are significant challenges to using these data to gain a better understanding of human health. 
As EHR data become increasingly used as a source of phenotypic information for biomedical research[2], it is crucial to develop strategies for coping with missing data.\nClinical laboratory assay results are a particularly rich data source within the EHR, but they also tend to have large amounts of missing data. These data may be missing for many different reasons. Some tests are used for routine screening, but screening may be biased. Other tests are only conducted if they are clinically relevant to very specific ailments. Patients may also receive care at multiple health care systems, resulting in information gaps at each institution. Age, sex, socioeconomic status, access to care, and medical conditions can all affect how comprehensive the data are for a given patient. Accounting for the mechanisms that cause data to be missing is critical, since failure to do so can lead to biased conclusions.\n\nBackground \nAside from the uncertainty associated with a variable that is not observed, many analytical methods, such as regression or principal components analysis, are designed to operate only on a complete dataset. The easiest way to implement these procedures is to remove variables with missing values or remove individuals with missing values. Eliminating variables is justifiable in many situations, especially if a given variable has a large proportion of missing values, but doing so may restrict the scope and power of a study. Removing individuals with missing data is another option known as complete-case analysis. This is generally not recommended unless the fraction of individuals that will be removed is small enough to be considered trivial, or there is good reason to believe that the absence of a value is due to random chance. If there are systematic differences between individuals with and without observations, complete-case analysis will be biased.\nAn alternative approach is to fill in the fields that are missing data with estimates. 
This process, called imputation, requires a model that makes assumptions about why only some values were observed. \"Missingness\" mechanisms fall somewhere in a spectrum between three scenarios (Figure 1).\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 1: Two general paradigms are commonly used to describe missing data. Missing data are considered ignorable if the probability of observing a variable has no relation to the value of the observed variable and are considered nonignorable otherwise. The second paradigm divides missingness into three categories: missing completely at random (MCAR: the probability of observing a variable is not dependent on its value or other observed values), missing at random (MAR: the probability of observing a variable is not dependent on its own value after conditioning on other observed variables), and missing not at random (MNAR: the probability of observing a variable is dependent on its value, even after conditioning on other observed variables). The x-axis indicates the extent to which a given value being observed depends on other values of other observed variables. The y-axis indicates the extent to which a given value being observed depends on its own value. \n\n\n\nWhen data are missing in a manner completely unrelated to both the observed and unobserved values, they are considered to be missing completely at random (MCAR).[3][4] When data are MCAR, the observed data represent a random sample of the population, but this is rarely encountered in practice. Conversely, data missing not at random (MNAR) refers to a situation where the probability of observing a data point depends on the value of that data point.[5] In this case, the mechanism responsible for the missing data is biased and should not be considered ignorable.[6] For example, rheumatoid factor is an antibody detectable in blood, and the concentration of this antibody is correlated with the presence and severity of rheumatoid arthritis. 
This test is typically performed only for patients with some indication of rheumatoid arthritis. Thus, patients with high rheumatoid factor levels are more likely to have rheumatoid factor measures.\nA more complicated scenario can arise when multiple variables are available. If the probability of observing a data point does not depend on the value of that data point, after conditioning on one or more additional variables, then that data point is said to be missing at random (MAR).[5] For example, a variable, X, may be MNAR if considered in isolation. However, if we observe another variable, Y, that explains some of the variation in X such that, after conditioning on Y, the probability of observing X is no longer related to its own value, then X is said to be MAR. In this way, Y can transform X from MNAR to MAR (Figure 1). We cannot prove that X is randomly sampled unless we measure some of the unobserved values, but strong correlations, the ability to explain missingness, and domain knowledge may provide evidence that the data are MAR.\nImputation methods assume specific mechanisms of missingness, and assumption violations can lead to bias in the results of downstream analyses that can be difficult to predict.[7][8] Variances of imputed values are often underestimated, causing artificially low P values.[9] Additionally, for data MNAR, the observed values have a different distribution from the missing values. To cope with this, a model can be specified to represent the missing data mechanism, but such models can be difficult to evaluate and may have a large impact on results. Great caution should be taken when handling missing data, particularly data that are MNAR. 
Most imputation methods assume that data are MAR or MCAR, but it is worth reiterating that these are all idealized states, and real data invariably fall somewhere in between (Figure 1).\n\nObjective \nWe aimed to provide a framework for characterizing and understanding the types of missing data present in the EHR. We also developed an open-source framework that other researchers can follow when dealing with missing data.\n\nMethods \nSource code \nWe provide the source code to reproduce this work in our repository on GitHub (GitHub, Inc.)[10] under a permissive open source license. In addition, we used continuous analysis[11] to generate Docker Hub (Docker Inc.) images matching the environment of the original analysis and to create intermediate results and logs. These artifacts are freely available.[12]\n\nElectronic health record data processing \nAll laboratory assays were mapped to Logical Observation Identifiers Names and Codes (LOINC). We restricted our analysis to outpatient laboratory results to minimize the effects of extreme results from inpatient and emergency department data. We used all laboratory results dated between August 8, 1996 and March 3, 2016, excluding codes for which less than 0.5% of patients had a result. The resulting dataset consisted of 669,212 individuals and 143 laboratory assays.\nWe removed any laboratory results that were obtained prior to the patient\u2019s 18th birthday or after their 90th. In cases where a date of death was present, we also removed laboratory results that were obtained within one year of death, as we found that the frequency of observations often spiked during this period and the values for certain laboratory tests were altered for patients near death. For each patient, a median date of observation was calculated based on their remaining laboratory results. We defined a temporal window of observation by removing any laboratory results recorded more than five years from the median date. 
We then calculated the median result of the remaining laboratory tests for each patient. As each variable had a different scale and many deviated from normality, we applied Box-Cox and Z-transformations to all variables. The final dataset used for all downstream analyses contained 602,366 patients and 146 variables (age, sex, body mass index [BMI], and 143 laboratory measures).\n\nVariable selection \nWe first ranked the laboratory measures by total amount of missingness, lowest to highest. At each rank, we calculated the percentage of complete cases for the set, including all lower-ranked measures. We also built a random forest classifier to predict the presence or absence of each variable. Based on these results and domain knowledge, we selected 28 variables that provided a reasonable trade-off between quantity and completeness and that we deemed to be largely MAR.\n\nPredicting the presence of data \nFor each clinical laboratory measure, we used the scikit-learn[13] random forest classifier to predict whether each value would be present. Each laboratory measure was converted to a binary label vector based on whether the measure was recorded. The values of all other laboratory measures, excluding comembers of a panel, were used as the training matrix input to the random forest. This process was repeated for each laboratory test using 10-fold cross-validation. We assessed prediction accuracy by the area under the receiver operating characteristic curve (AUROC) using the trapezoidal rule.\n\nSampling of complete cases \nTo generate a set of complete cases that resembled the whole population, we randomly sampled 100,000 patients without replacement. 
We then matched each of these individuals to the most similar patient who had a value for each of the 28 most common laboratory tests by matching sex and finding the minimal Euclidean distance of age and BMI.\n\nSimulation of missing data \nWithin the sampled complete cases, we selected the data for removal by four mechanisms:\nSimulation 1: Missing Completely at Random\nWe replaced values with NaN (indicator of missing data) at random. We repeated this procedure 10 times each for 10%, 20%, 30%, 40%, and 50% missingness, yielding 50 simulated datasets.\nSimulation 2: Missing at Random\nWe selected two columns (A and B) and a quartile. For the values from column A within the quartile, we randomly replaced 50% of the values from column B with NaN. We repeated the procedure for each quartile and each laboratory test combination, yielding 3024 simulated datasets.\nSimulation 3: Missing Not at Random\nWe selected a column and a quartile. When the column\u2019s value was in the quartile, we replaced it with NaN 50% of the time. We repeated this procedure for each of the four quartiles of each of the 28 laboratory values, generating a total of 112 simulated datasets.\nSimulation 4: Missingness Based on Real Data Observations\nFrom our complete-cases dataset, we matched each patient to the nearest neighbor, excluding self-matches, in the entire population based on their sex, age, and BMI. We then replaced any laboratory value in the complete cases with NaN if it was absent in the matched patient.\n\nImputation of missing data \nUsing our simulated datasets (simulations 1-4), we compared 18 common imputation methods (12 representative methods are shown in the figures below) from the fancyimpute[14] and the multivariate imputation by chained equations (MICE v2.30)[15] packages. 
Multimedia Appendix 1 (table) shows a full list of imputation methods and the parameters used for each.\n\nResults \nOur first step was to select a subset of the 143 laboratory measures for which imputation would be a reasonable approach. We began by ranking the clinical laboratory measures in descending order by the number of patients who had an observed value for that test. For each ranked laboratory test, we plotted the percentage of individuals missing a value, as well as the percentage of complete cases when that given test was joined with all the tests with lower ranks (i.e., less missingness). These plots showed that the best trade-off between quantity of data and completeness was between 20 and 30 variables (Figure 2, part A). Beyond the 30 most common laboratory tests, the number of complete cases rapidly approached zero.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 2: Summary of missing data across 143 clinical laboratory measures. (A) After ranking the clinical laboratory measures by the total number of results, the percentage of patients missing a result for each test was plotted (red points). At each rank, the percentage of complete cases for all tests of equal or lower rank was also plotted (blue points). Only variables with a rank \u226475 are shown. The vertical bar indicates the 28 tests that were selected for further analysis. (B) The full distribution of patient median ages is shown in blue, and the fraction of individuals in each age group that had a complete set of observations for tests 1-28 is shown in red. (C) Within the 28 laboratory tests that were selected for imputation analyses, the mean number of missing tests is depicted as a function of age. (D) Within the 28 laboratory tests that were selected for imputation, the mean number of missing tests is depicted as a function of body mass index (BMI). (E) Accuracy of a random forest predicting the presence or absence of all 143 laboratory tests. 
AUROC: area under the receiver operating characteristic curve. (F) Accuracy of a random forest predicting the presence or absence of the top 28 laboratory tests, by Logical Observation Identifiers Names and Codes (LOINC).\n\n\n\nAs age, sex, and BMI have a considerable impact on what clinical laboratory measures are collected, we evaluated the relationship between missingness and these covariates (Figure 2, parts B-D). We also used a random forest approach to predict the presence or absence of each measure based on the values of the other observed measures. MCAR data are not predictable, resulting in AUROCs near 0.5. Only 38 of the 143 laboratory tests had AUROCs less than 0.55 (Figure 2, part E). Very high AUROCs are most consistent with data that are MAR. For the top 30 candidate clinical laboratory measures based on the number of complete cases, the mean AUROC was 0.82. This suggested that the observed data could explain much of the mechanism responsible for the missing data within this set. We ultimately decided not to include the 29th-ranked laboratory test, specific gravity of urine (2965-2), since it had an AUROC of only 0.69 and is typically used for screening only within urology or nephrology departments (RV Levy, MD, personal conversation, June 2017). We included the lipid measures (ranks 25-28) since they had AUROC values near 0.82 and are recommended for screening of patients depending on age, sex, and BMI.[16] Our data confirmed that age, sex, and BMI all predicted the presence of lipid measures (Multimedia Appendix 1, fig 1A-B).\nTo assess the accuracy of imputation methods, we required known values to compare with imputed values. 
Thus, we restricted our analysis to a subset of patients who were complete cases for the 28 selected variables (Table 1).[17] Since the characteristics of this subset differed from those of the broader population (Figure 2, parts B-D), we used sampling and k-nearest neighbors (KNN) matching to generate a subset of the complete cases that better resembled the overall population. We then simulated missing data within this set by four mechanisms: MCAR, MAR, MNAR, and realistic patterns based on the original data.\n\n\n\n\n\n\n\nTable 1. Logical Observation Identifiers Names and Codes (LOINC) and descriptions of the most frequently ordered clinical laboratory measurements. The assays are ranked from the most common to the least.\r\n \r\naHDL: high-density lipoprotein; bLDL: low-density lipoprotein\n\n\nLOINC\n\nDescription\n\n\n718-7\n\nHemoglobin [Mass\/volume] in Blood\n\n\n4544-3\n\nHematocrit [Volume Fraction] of Blood by Automated count\n\n\n787-2\n\nErythrocyte mean corpuscular volume [Entitic volume] by Automated count\n\n\n786-4\n\nErythrocyte mean corpuscular hemoglobin concentration [Mass\/volume] by Automated count\n\n\n785-6\n\nErythrocyte mean corpuscular hemoglobin [Entitic mass] by Automated count\n\n\n6690-2\n\nLeukocytes [#\/volume] in Blood by Automated count\n\n\n789-8\n\nErythrocytes [#\/volume] in Blood by Automated count\n\n\n788-0\n\nErythrocyte distribution width [Ratio] by Automated count\n\n\n32623-1\n\nPlatelet mean volume [Entitic volume] in Blood by Automated count\n\n\n777-3\n\nPlatelets [#\/volume] in Blood by Automated count\n\n\n2345-7\n\nGlucose [Mass\/volume] in Serum or Plasma\n\n\n2160-0\n\nCreatinine [Mass\/volume] in Serum or Plasma\n\n\n2823-3\n\nPotassium [Moles\/volume] in Serum or Plasma\n\n\n3094-0\n\nUrea nitrogen [Mass\/volume] in Serum or Plasma\n\n\n2951-2\n\nSodium [Moles\/volume] in Serum or Plasma\n\n\n2075-0\n\nChloride [Moles\/volume] in Serum or Plasma\n\n\n2028-9\n\nCarbon dioxide, total [Moles\/volume] in Serum or 
Plasma\n\n\n17861-6\n\nCalcium [Mass\/volume] in Serum or Plasma\n\n\n1743-4\n\nAlanine aminotransferase [Enzymatic activity\/volume] in Serum or Plasma by With P-5'-P\n\n\n30239-8\n\nAspartate aminotransferase [Enzymatic activity\/volume] in Serum or Plasma by With P-5'-P\n\n\n1975-2\n\nBilirubin.total [Mass\/volume] in Serum or Plasma\n\n\n2885-2\n\nProtein [Mass\/volume] in Serum or Plasma\n\n\n10466-1\n\nAnion gap 3 in Serum or Plasma\n\n\n751-8\n\nNeutrophils [#\/volume] in Blood by Automated count\n\n\n2093-3\n\nCholesterol [Mass\/volume] in Serum or Plasma\n\n\n2571-8\n\nTriglyceride [Mass\/volume] in Serum or Plasma\n\n\n2085-9\n\nCholesterol in HDLa [Mass\/volume] in Serum or Plasma\n\n\n13457-7\n\nCholesterol in LDLb [Mass\/volume] in Serum or Plasma by calculation\n\n\n\nWe next evaluated our ability to predict the presence of each value in the simulated datasets. These simulations confirmed that our MCAR simulation had a low AUROC (Figure 3, part A). The MAR data (Figure 3, part B) and MNAR data (Figure 3, part C) were often well predicted, particularly the MAR data and cases where data were missing from the tails of distributions. The AUROCs rarely exceeded 0.75 in the MNAR simulations, while values above 0.75 were typical in the MAR simulations. This provided additional support for our decision to restrict our focus to the top 28 laboratory measures, since they all had AUROCs between 0.75 and 0.9, which was outside the range of MNAR simulations (Figure 2, part F and Figure 3, part C).\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 3: Area under the receiver operating characteristic curve (AUROC) of a random forest predicting whether data will be present or missing. (A) Missing completely at random simulation. (B) Missing at random simulation. (C) Missing not at random simulation.\n\n\n\nWe chose to test the accuracy of imputation for several methods from two popular and freely available libraries: the MICE package for R and the fancyimpute library for Python. 
We first applied each of these methods across simulations 1 to 3. For each combination, Figure 4 depicts the overall root mean square errors. Multimedia Appendix 1 (Supplemental Table and Figures 3-21) shows a breakdown of all the methods and parameters.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 4: Imputation accuracy measured by root mean square error (RMSE) across simulations 1-3. (A) Missing completely at random (MCAR). (B) Missing at random (MAR). (C) Missing not at random (MNAR). FI: fancyimpute; KNN: k-nearest neighbors; MICE: Multivariate Imputation by Chained Equations; pmm: predictive mean matching; RF: random forest; SVD: singular value decomposition.\n\n\n\nWe next measured imputation accuracy based on the patterns of missingness that we observed in the real data (Figure 5). The main difference compared with simulations 1 to 3 was lower error for some of the deterministic methods (mean, median, and KNN). It is worth mentioning that the error was highly dependent on the variable that was being imputed. Specifically, for the fancyimpute MICE predictive mean matching (pmm) method, multicollinearity within some of the variables caused convergence failures that led to extremely large errors (Figure 5, method MICE pmm [FI]). These factors were relatively easy to address in the R package MICE pmm method by adjusting the predictor matrix.[15]\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 5: Imputation root mean square error (RMSE) for a subset of 10,000 patients from simulation 4. A total of 12 imputation methods were tested (x-axis), and each color corresponds to a Logical Observation Identifiers Names and Codes (LOINC) code. The black line shows the theoretical error from random sampling. 
FI: fancyimpute; KNN: k-nearest neighbors; MICE: Multivariate Imputation by Chained Equations; pmm: predictive mean matching; RF: random forest; SVD: singular value decomposition.\n\n\n\nIn addition to evaluating the accuracy of imputation, it is also important to estimate the uncertainty associated with imputation. One approach to address this is multiple imputation, where each data point is imputed multiple times using a nondeterministic method. To determine whether each method properly captured the true uncertainty of the data, we compared the error between an imputed dataset and the observed data versus the error between two sets of imputed values for each method (Figure 6). If these errors are equal, then multiple imputation is likely producing good estimates of uncertainty. If, however, the error between two imputed datasets is less than that between each imputed dataset and the known values, then the imputation method is likely underestimating the variance.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 6: Assessment of multiple imputation for each method. Using simulation 4, missing values were imputed multiple times with each method. The x-axes show the root mean square error (RMSE) between the imputed data and the observed values. The y-axes show the RMSE between multiple imputations of the same data. The axis scales vary between panels to better show the range of variation. The laboratory tests are indicated by the color of the points. The black diagonal line represents unity (y=x). Panels are ordered by each method\u2019s mean deviation (MD) from unity, indicated in the top left corner of each panel. In the last 7 panels, the unity line is not visible because the variation between multiple imputations was close to zero. 
FI: fancyimpute; KNN: k-nearest neighbors; MICE: Multivariate Imputation by Chained Equations; pmm: predictive mean matching; RF: random forest; SVD: singular value decomposition.\n\n\n\nOur results (Figure 6) demonstrate that many of the imputation methods are not suitable for multiple imputation. Of the methods that had the lowest error in the MCAR, MAR, and MNAR simulations, we found three (softImpute, MICE col [fancyimpute], MICE norm.pred [R]) to have minimal variation between imputations. This was also true of KNN, singular value decomposition (SVD), mean, and median imputation. Only three methods (random sampling, MICE norm [R], and MICE pmm [R]) seemed to have similar error between the multiple imputations and the observed data and thus appear to be unbiased. The latter two had very similar performances and are the best candidates for multiple imputation. Two methods had intermediate performance. MICE random forest (R) was similar to several other MICE methods in terms of error relative to the observed data, but it produced slightly less variation between each imputed dataset. This seemed to affect some variables more than others but there was no obvious pattern. The MICE pmm (fancyimpute) was not deterministic but it did seem to achieve low error at the expense of increased bias. In this case, the variables that could be imputed with the lowest error also seemed to have the most bias. Since this method claims to be a reimplementation of the MICE pmm (R) method, this may be due to multicollinearity among the variables that could not easily be accounted for, as there was no simple way to alter the predictor matrix.\n\nDiscussion \nPrincipal results \nIt is not possible, or even desirable, to choose \u201cthe best\u201d imputation method. 
There are many considerations that may not be generalizable between different sets of data; however, we can draw some general conclusions about how different methods compare in terms of error, bias, complexity, and difficulty of implementation. Based on our results, there seem to be three broad categories of methods.\nThe first category is the simple deterministic methods. These include mean or median imputation and KNN. While easy to implement, mean or median imputation may lead to severe bias and large errors if the unobserved data are more likely to come from the tails of the observed distribution (Figure 4, parts A-C, methods mean, median, and KNN). This will also cause the variance of the distribution to be underestimated if more than a small fraction of the data is missing. Since these methods are deterministic, they are also not suitable for multiple imputation (Figure 6, bottom row).\nKNN is a popular choice for imputation that has been shown to perform very well for some types of data[18][19], but it was not particularly well suited for our data, regardless of the choice of k. This may be due to issues of data dimensionality[20] or to individuals not falling into well-separated groups based on their clinical laboratory results. This method is also not suitable for large datasets, since a distance matrix for all pairs of individuals is stored in memory during computation, and the size of the distance matrix scales with n^2.\nThe second category of algorithms could be called the sophisticated deterministic methods. These include SVD, softImpute, MICE col, and MICE norm.pred. SVD performed poorly compared with its counterparts and sometimes produced errors greater than simple random sampling (Figure 5, method SVD). The reasons for this are not clear, but we cannot recommend this method. SoftImpute, MICE col, and MICE norm.pred were among the lowest-error methods in all of our simulations (Figure 5, methods MICE col and norm.pred). 
The main limitation of these methods is that they cannot be used for multiple imputation (Figure 6, middle row).\nThe third broad category of algorithms comprises the stochastic methods, which included random sampling and most of the remaining methods in the MICE library. Random sampling almost always produced the highest error (Figure 4 and Figure 5, method random sample), but it has the advantage of being easy to implement and it requires no parameter selection. The MICE methods based on pmm, random forests, and Bayesian linear regression tended to perform similarly in terms of error in most of our simulations (Figure 4 and Figure 5, methods MICE pmm, RF, and norm).\nImputation methods that involve stochasticity allow for a fundamentally different type of analysis called multiple imputation. In this paradigm, multiple imputed datasets (a minimum of three and often 10-20 depending on the percentage of missing data)[21][22][23] are generated, and each is analyzed in the same way. At the end of all downstream analyses, the results are then compared. Typically, the ultimate result of interest is supported by a P value, a regression coefficient, an odds ratio, etc. In the case of a multiply imputed dataset, the researcher will have several output statistics that can be used to estimate a confidence interval for the result.\nMultiple imputation has been gaining traction recently, and the MICE package has become one of the most popular choices for implementing this procedure. This package is powerful and very well documented[15] but, as with all imputation methods, it must be used with caution. In MICE, each variable is imputed one by one. This entire process is then repeated for a number of iterations such that the values imputed in one iteration can update the estimates for the next iteration. 
The result is a chain of imputed datasets, and this entire process is typically performed in parallel so that multiple chains are generated.\nIn MICE, several choices must be made. The first obvious choice is the imputation method (i.e., equation). Many methods are available in the base package, additional methods can be added from other packages[24], and users can even define their own. We thoroughly evaluated three methods in the context of our dataset: pmm, Bayesian linear regression (norm), and random forest.\nThe pmm is the default choice, and it can be used on a mixture of numeric and categorical variables. We found pmm to have a good trade-off between error and bias, but for our dataset it was critical to remove several variables from the predictor matrix due to strong correlations (R>.85) and multicollinearity. Bayesian regression performed similarly but was less sensitive to these issues. If a dataset contains only numeric values, Bayesian regression may be a safer option. Random forest tended to produce results that were slightly biased for a subset of the variables without an appreciable reduction in error. Aside from random sampling, none of the other methods we evaluated were suitable for multiple imputation (Figure 6).\n\nConclusions \nMany factors must be considered when analyzing a dataset with missing values. This starts by determining whether each variable should be considered at all. Two good reasons to reject a variable are if it has too many missing values or if it is likely to be MNAR. 
If a variable is deemed to be MNAR, it may still be possible to impute, but the mechanism of missingness should be explicitly modeled, and a sensitivity analysis is recommended to assess how much impact this could have on the final results.[25][26] While a statistical model of the mechanism of missingness is useful, there is no substitute for a deep familiarity with the data at hand and how they were generated.\nHaving selected the data, one must select an imputation method. Ideally, several methods should be tested in a realistic setting. Great care should be taken to construct a set of complete data that closely resemble all of the relevant characteristics of the data that one wishes to impute. Similar care should then be taken to remove some of these data in ways that closely resemble the observed patterns of missingness. If this is not feasible, one may also simulate a variety of datasets representing a range of possible data structures and missingness mechanisms. Any available imputation methods can then be applied to the simulated data, and the error between the imputed data and their known values provides a metric of performance.\nWhile the minimization of error is an important goal, a singular focus on this objective is likely to lead to bias. For each missing value, it is also important to estimate the uncertainty associated with it. This can be achieved by multiple imputation using an algorithm that incorporates stochastic processes. Multiple imputation has become the field standard because it provides confidence intervals for the results of downstream analyses. One should not naively assume that any stochastic process is free of bias. It is important to check that multiple imputation is providing variability that corresponds to the actual uncertainty of the imputed values using a set of simulated data.\n\nAcknowledgements \nWe thank Dr. Casey S. Greene (University of Pennsylvania) for his helpful discussions. We also thank Dr. Rebecca V. 
Levy (Geisinger) for providing expert clinical domain knowledge.\nThis work was supported by the Commonwealth Universal Research Enhancement Program grant from the Pennsylvania Department of Health. BBJ and JM were also supported by US National Institutes of Health grants AI116794 and LM010098 to JM.\n\nAuthors' contributions \nBBJ, JM, SAP, and CRB conceived of the study. DRL and JWS performed data processing. BBJ and CRB performed analyses. BBJ, SAP, and CRB wrote the manuscript, and all authors revised and approved the final manuscript.\n\nConflicts of interest \nNone declared.\n\nAbbreviations \nAUROC: area under the receiver operating characteristic curve\nBMI: body mass index\nEHR: electronic health record\nKNN: k-nearest neighbors\nLOINC: Logical Observation Identifiers Names and Codes\nMAR: missing at random\nMCAR: missing completely at random\nMICE: Multivariate Imputation by Chained Equations\nMNAR: missing not at random\npmm: predictive mean matching\nSVD: singular value decomposition\n\nAdditional files \nSupplemental table and figures: PDF File (Adobe PDF File), 4MB\n\nReferences \n\n\n\u2191 Steinbrook, R.. \"Health Care and the American Recovery and Reinvestment Act\". New England Journal of Medicine 360 (11): 1057\u20131060. doi:10.1056\/NEJMp0900665. PMID 19224738.   \n\n\u2191 Flintoft, L.. \"Disease genetics: Phenome-wide association studies go large\". Nature Reviews Genetics 15 (1): 2. doi:10.1038\/nrg3637. PMID 24322724.   \n\n\u2191 Wells, B.J.; Chagin, K.M.; Nowacki, A.S.; Kattan, M.W.. \"Strategies for handling missing data in electronic health record derived data\". EGEMS 1 (3): 1035. doi:10.13063\/2327-9214.1035. PMC PMC4371484. PMID 25848578. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4371484 .   \n\n\u2191 Bounthavong, M.; Watanabe, J.H.; Sullivan, K.M.. \"Approach to addressing missing data for electronic medical records and pharmacy claims data research\". Pharmacotherapy 35 (4): 380\u20137. 
doi:10.1002\/phar.1569. PMID 25884526.   \n\n\u2191 5.0 5.1 Bhaskaran, K.; Smeeth, L.. \"What is the difference between missing completely at random and missing at random?\". International Journal of Epidemiology 43 (4): 1336-9. doi:10.1093\/ije\/dyu080. PMC PMC4121561. PMID 24706730. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4121561 .   \n\n\u2191 Rubin, D.B.. \"Inference and missing data\". Biometrika 63 (3): 581\u2013592. doi:10.1093\/biomet\/63.3.581.   \n\n\u2191 J\u00f6rnsten, R.; Ouyang, M.; Wang, H.Y.. \"A meta-data based method for DNA microarray imputation\". BMC Bioinformatics 8: 109. doi:10.1186\/1471-2105-8-109. PMC PMC1852325. PMID 17394658. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC1852325 .   \n\n\u2191 Beaulieu-Jones, B.K.; Moore, J.H.. \"Missing data imputation in the electronic health record using deeply learned autoencoders\". Pacific Symposium on Biocomputing 22: 207-218. doi:10.1142\/9789813207813_0021. PMC PMC5144587. PMID 27896976. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5144587 .   \n\n\u2191 Allison, P.D. (2002). Missing Data. Quantitative Applications in the Social Sciences. 136. SAGE Publications. doi:10.4135\/9781412985079. ISBN 9781412985079.   \n\n\u2191 \"EpistasisLab\/imputation\". GitHub, Inc. https:\/\/github.com\/EpistasisLab\/imputation . Retrieved 01 December 2017 .   \n\n\u2191 Beaulieu-Jones, B.K.; Greene, C.S.. \"Reproducibility of computational workflows is automated using continuous analysis\". Nature Biotechnology 35 (4): 342-346. doi:10.1038\/nbt.3780. PMID 28288103.   \n\n\u2191 \"brettbj\/ehr-imputation\". Docker, Inc. https:\/\/hub.docker.com\/r\/brettbj\/ehr-imputation\/ . Retrieved 02 December 2017 .   \n\n\u2191 Pedregosa, F.; Varoquaux, G.; Gramfort, A. et al. (2011). \"Scikit-learn: Machine Learning in Python\". Journal of Machine Learning Research 12 (10): 2825\u20132830. 
http:\/\/www.jmlr.org\/papers\/v12\/pedregosa11a.html .   \n\n\u2191 \"iskandr\/fancyimpute\". GitHub, Inc. https:\/\/github.com\/iskandr\/fancyimpute . Retrieved 15 February 2018 .   \n\n\u2191 15.0 15.1 15.2 van Buuren, S.; Groothuis-Oudshoorn, K. (2011). \"mice: Multivariate Imputation by Chained Equations in R\". Journal of Statistical Software 45 (3). doi:10.18637\/jss.v045.i03.   \n\n\u2191 Helfand, M.; Carson, S. (2008). \"Screening for Lipid Disorders in Adults: Selective Update of 2001 US Preventive Services Task Force Review\". Evidence Syntheses 49. PMID 20722146.   \n\n\u2191 McDonald, C.J.; Huff, S.M.; Suico, J.G. et al. (2003). \"LOINC, a universal standard for identifying laboratory observations: A 5-year update\". Clinical Chemistry 49 (4): 624\u201333. PMID 12651816.   \n\n\u2191 Beretta, L.; Santaniello, A. (2016). \"Nearest neighbor imputation algorithms: A critical evaluation\". BMC Medical Informatics and Decision Making 16 (Suppl 3): 74. doi:10.1186\/s12911-016-0318-z. PMC PMC4959387. PMID 27454392. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4959387 .   \n\n\u2191 Troyanskaya, O.; Cantor, M.; Sherlock, G. et al. (2001). \"Missing value estimation methods for DNA microarrays\". Bioinformatics 17 (6): 520\u20135. PMID 11395428.   \n\n\u2191 Pestov, V. (2013). \"Is the k-NN classifier in high dimensions affected by the curse of dimensionality?\". Computers & Mathematics with Applications 65 (10): 1427\u201337. doi:10.1016\/j.camwa.2012.09.011.   \n\n\u2191 Stuart, E.A.; Azur, M.; Frangakis, C.; Leaf, P. (2009). \"Multiple imputation with large data sets: A case study of the Children's Mental Health Initiative\". American Journal of Epidemiology 169 (9): 1133-9. doi:10.1093\/aje\/kwp026. PMC PMC2727238. PMID 19318618. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2727238 .   \n\n\u2191 White, I.R.; Royston, P.; Wood, A.M. (2011). 
\"Multiple imputation using chained equations: Issues and guidance for practice\". Statistics in Medicine 30 (4): 377-99. doi:10.1002\/sim.4067. PMID 21225900.   \n\n\u2191 Bodner, T.E. (2008). \"What Improves with Increased Missing Data Imputations?\". Structural Equation Modeling 15 (4): 651\u201375. doi:10.1080\/10705510802339072.   \n\n\u2191 Robitzsch, A.; Grund, S.; Henke, T.. \"miceadds: Some Additional Multiple Imputation Functions, Especially for 'mice'\". The Comprehensive R Archive Network. https:\/\/cran.r-project.org\/web\/packages\/miceadds\/index.html . Retrieved 15 February 2018 .   \n\n\u2191 H\u00e9raud-Bousquet, V.; Larsen, C.; Carpenter, J. et al. (2012). \"Practical considerations for sensitivity analysis after multiple imputation applied to epidemiological studies with incomplete data\". BMC Medical Research Methodology 12: 73. doi:10.1186\/1471-2288-12-73. PMC PMC3537570. PMID 22681630. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3537570 .   \n\n\u2191 Carpenter, J.R.; Kenward, M.G.; White, I.R. (2007). \"Sensitivity analysis after multiple imputation under missing at random: a weighting approach\". Statistical Methods in Medical Research 16 (3): 259-75. doi:10.1177\/0962280206075303. PMID 17621471.   
\n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to grammar, spelling, and presentation, including the addition of PMCID and DOI when they were missing from the original reference.\n\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\">https:\/\/www.limswiki.org\/index.php\/Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis<\/a>\nCategories: LIMSwiki journal articles (added in 2018); LIMSwiki journal articles (all); LIMSwiki journal articles on health informatics\n\nThis page was last modified on 6 March 2018, at 21:42.\nContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n\n","8159b0ee46c6326792ce28d0e7506e33_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_Characterizing_and_managing_missing_structured_data_in_electronic_health_records_Data_analysis skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:Characterizing and managing missing structured data in electronic health records: Data analysis<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p><b>Background<\/b>: Missing data is a challenge for all studies; however, this is especially true for <a href=\"https:\/\/www.limswiki.org\/index.php\/Electronic_health_record\" title=\"Electronic health record\" target=\"_blank\" class=\"wiki-link\" 
data-key=\"f2e31a73217185bb01389404c1fd5255\">electronic health record<\/a> (EHR)-based analyses. Failure to appropriately consider missing data can lead to biased results. While there has been extensive theoretical work on imputation, and many sophisticated methods are now available, it remains quite challenging for researchers to implement these methods appropriately. Here, we provide detailed procedures for when and how to conduct imputation of EHR <a href=\"https:\/\/www.limswiki.org\/index.php\/Laboratory\" title=\"Laboratory\" target=\"_blank\" class=\"wiki-link\" data-key=\"c57fc5aac9e4abf31dccae81df664c33\">laboratory<\/a> results.\n<\/p><p><b>Objective<\/b>: The objective of this study was to demonstrate how the mechanism of \"missingness\" can be assessed, evaluate the performance of a variety of imputation methods, and describe some of the most frequent problems that can be encountered.\n<\/p><p><b>Methods<\/b>: We analyzed <a href=\"https:\/\/www.limswiki.org\/index.php\/Clinical_laboratory\" title=\"Clinical laboratory\" target=\"_blank\" class=\"wiki-link\" data-key=\"307bcdf1bdbcd1bb167cee435b7a5463\">clinical laboratory<\/a> measures from 602,366 patients in the EHR of Geisinger Health System in Pennsylvania, USA. 
Using these data, we constructed a representative set of complete cases and assessed the performance of 12 different imputation methods for missing data that was simulated based on four mechanisms of missingness (missing completely at random, missing not at random, missing at random, and real data modelling).\n<\/p><p><b>Results<\/b>: Our results showed that several methods, including variations of multivariate imputation by chained equations (MICE) and softImpute, consistently imputed missing values with low error; however, only a subset of the MICE methods was suitable for multiple imputation.\n<\/p><p><b>Conclusions<\/b>: The analyses we describe provide an outline of considerations for dealing with missing EHR data, steps that researchers can perform to characterize missingness within their own data, and an evaluation of methods that can be applied to impute clinical data. While the performance of methods may vary between datasets, the process we describe can be generalized to the majority of structured data types that exist in EHRs, and all of our methods and code are publicly available.\n<\/p><p><b>Keywords<\/b>: imputation, missing data, clinical laboratory test results, electronic health records \n<\/p>\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Justification\">Justification<\/span><\/h3>\n<p>Missing data present a challenge to researchers in many fields, and this challenge is growing as datasets increase in size and scope. This is especially problematic for electronic health records (EHRs), where missing values frequently outnumber observed values. 
EHRs were designed to record and improve patient care and streamline billing rather than act as resources for research<sup id=\"rdp-ebb-cite_ref-SteinbrookHealth09_1-0\" class=\"reference\"><a href=\"#cite_note-SteinbrookHealth09-1\" rel=\"external_link\">[1]<\/a><\/sup>; thus, there are significant challenges to using these data to gain a better understanding of human health. As EHR data become increasingly used as a source of phenotypic <a href=\"https:\/\/www.limswiki.org\/index.php\/Information\" title=\"Information\" target=\"_blank\" class=\"wiki-link\" data-key=\"6300a14d9c2776dcca0999b5ed940e7d\">information<\/a> for biomedical research<sup id=\"rdp-ebb-cite_ref-FlintoftDisease14_2-0\" class=\"reference\"><a href=\"#cite_note-FlintoftDisease14-2\" rel=\"external_link\">[2]<\/a><\/sup>, it is crucial to develop strategies for coping with missing data.\n<\/p><p>Clinical laboratory assay results are a particularly rich data source within the EHR, but they also tend to have large amounts of missing data. These data may be missing for many different reasons. Some tests are used for routine screening, but screening may be biased. Other tests are only conducted if they are clinically relevant to very specific ailments. Patients may also receive care at multiple health care systems, resulting in information gaps at each institution. Age, sex, socioeconomic status, access to care, and medical conditions can all affect how comprehensive the data are for a given patient. Accounting for the mechanisms that cause data to be missing is critical, since failure to do so can lead to biased conclusions.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Background\">Background<\/span><\/h3>\n<p>Aside from the uncertainty associated with a variable that is not observed, many analytical methods, such as regression or principal components analysis, are designed to operate only on a complete dataset. 
The easiest way to implement these procedures is to remove variables with missing values or remove individuals with missing values. Eliminating variables is justifiable in many situations, especially if a given variable has a large proportion of missing values, but doing so may restrict the scope and power of a study. Removing individuals with missing data is another option known as complete-case analysis. This is generally not recommended unless the fraction of individuals that will be removed is small enough to be considered trivial, or there is good reason to believe that the absence of a value is due to random chance. If there are systematic differences between individuals with and without observations, complete-case analysis will be biased.\n<\/p><p>An alternative approach is to fill in the fields that are missing data with estimates. This process, called imputation, requires a model that makes assumptions about why only some values were observed. \"Missingness\" mechanisms fall somewhere in a spectrum between three scenarios (Figure 1).\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig1_Beaulieu-JonesJMIRMedInfo2018_6-1.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"0ec06a19b94896867cbbe3c324a3c2e5\"><img alt=\"Fig1 Beaulieu-JonesJMIRMedInfo2018 6-1.png\" src=\"https:\/\/www.limswiki.org\/images\/6\/68\/Fig1_Beaulieu-JonesJMIRMedInfo2018_6-1.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 1:<\/b> Two general paradigms are commonly used to describe missing data. 
Missing data are considered ignorable if the probability of observing a variable has no relation to the value of the observed variable and are considered nonignorable otherwise. The second paradigm divides missingness into three categories: missing completely at random (MCAR: the probability of observing a variable is not dependent on its value or other observed values), missing at random (MAR: the probability of observing a variable is not dependent on its own value after conditioning on other observed variables), and missing not at random (MNAR: the probability of observing a variable is dependent on its value, even after conditioning on other observed variables). The x-axis indicates the extent to which a given value being observed depends on other values of other observed variables. The y-axis indicates the extent to which a given value being observed depends on its own value. <\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>When data are missing in a manner completely unrelated to both the observed and unobserved values, they are considered to be missing completely at random (MCAR).<sup id=\"rdp-ebb-cite_ref-WellsStrat13_3-0\" class=\"reference\"><a href=\"#cite_note-WellsStrat13-3\" rel=\"external_link\">[3]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-BounthavongApproach15_4-0\" class=\"reference\"><a href=\"#cite_note-BounthavongApproach15-4\" rel=\"external_link\">[4]<\/a><\/sup> When data are MCAR, the observed data represent a random sample of the population, but this is rarely encountered in practice. 
Conversely, data missing not at random (MNAR) refers to a situation where the probability of observing a data point depends on the value of that data point.<sup id=\"rdp-ebb-cite_ref-BhaskaranWhat14_5-0\" class=\"reference\"><a href=\"#cite_note-BhaskaranWhat14-5\" rel=\"external_link\">[5]<\/a><\/sup> In this case, the mechanism responsible for the missing data is biased and should not be considered ignorable.<sup id=\"rdp-ebb-cite_ref-RubinInference76_6-0\" class=\"reference\"><a href=\"#cite_note-RubinInference76-6\" rel=\"external_link\">[6]<\/a><\/sup> For example, rheumatoid factor is an antibody detectable in blood, and the concentration of this antibody is correlated with the presence and severity of rheumatoid arthritis. This test is typically performed only for patients with some indication of rheumatoid arthritis. Thus, patients with high rheumatoid factor levels are more likely to have rheumatoid factor measures.\n<\/p><p>A more complicated scenario can arise when multiple variables are available. If the probability of observing a data point does not depend on the value of that data point, after conditioning on one or more additional variables, then that data point is said to be missing at random (MAR).<sup id=\"rdp-ebb-cite_ref-BhaskaranWhat14_5-1\" class=\"reference\"><a href=\"#cite_note-BhaskaranWhat14-5\" rel=\"external_link\">[5]<\/a><\/sup> For example, a variable, <i>X<\/i>, may be MNAR if considered in isolation. However, if we observe another variable, <i>Y<\/i>, that explains some of the variation in <i>X<\/i> such that, after conditioning on <i>Y<\/i>, the probability of observing <i>X<\/i> is no longer related to its own value, then <i>X<\/i> is said to be MAR. In this way, <i>Y<\/i> can transform <i>X<\/i> from MNAR to MAR (Figure 1). 
We cannot prove that <i>X<\/i> is randomly sampled unless we measure some of the unobserved values, but strong correlations, the ability to explain missingness, and domain knowledge may provide evidence that the data are MAR.\n<\/p><p>Imputation methods assume specific mechanisms of missingness, and assumption violations can lead to bias in the results of downstream analyses that can be difficult to predict.<sup id=\"rdp-ebb-cite_ref-J.C3.B6rnstenAMeta07_7-0\" class=\"reference\"><a href=\"#cite_note-J.C3.B6rnstenAMeta07-7\" rel=\"external_link\">[7]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-Beaulieu-JonesMissing17_8-0\" class=\"reference\"><a href=\"#cite_note-Beaulieu-JonesMissing17-8\" rel=\"external_link\">[8]<\/a><\/sup> Variances of imputed values are often underestimated, causing artificially low <i>P<\/i> values.<sup id=\"rdp-ebb-cite_ref-AllisonMissing02_9-0\" class=\"reference\"><a href=\"#cite_note-AllisonMissing02-9\" rel=\"external_link\">[9]<\/a><\/sup> Additionally, for data MNAR, the observed values have a different distribution from the missing values. To cope with this, a model can be specified to represent the missing data mechanism, but such models can be difficult to evaluate and may have a large impact on results. Great caution should be taken when handling missing data, particularly data that are MNAR. Most imputation methods assume that data are MAR or MCAR, but it is worth reiterating that these are all idealized states, and real data invariably fall somewhere in between (Figure 1).\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Objective\">Objective<\/span><\/h3>\n<p>We aimed to provide a framework for characterizing and understanding the types of missing data present in the EHR. 
We also developed an open-source framework that other researchers can follow when dealing with missing data.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Methods\">Methods<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Source_code\">Source code<\/span><\/h3>\n<p>We provide the source code to reproduce this work in our repository on GitHub (GitHub, Inc.)<sup id=\"rdp-ebb-cite_ref-GHImputation_10-0\" class=\"reference\"><a href=\"#cite_note-GHImputation-10\" rel=\"external_link\">[10]<\/a><\/sup> under a permissive open source license. In addition, we used continuous analysis<sup id=\"rdp-ebb-cite_ref-Beaulieu-JonesRepro17_11-0\" class=\"reference\"><a href=\"#cite_note-Beaulieu-JonesRepro17-11\" rel=\"external_link\">[11]<\/a><\/sup> to generate Docker Hub (Docker Inc.) images matching the environment of the original analysis and to create intermediate results and logs. These artifacts are freely available.<sup id=\"rdp-ebb-cite_ref-DockerImputation_12-0\" class=\"reference\"><a href=\"#cite_note-DockerImputation-12\" rel=\"external_link\">[12]<\/a><\/sup>\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Electronic_health_record_data_processing\">Electronic health record data processing<\/span><\/h3>\n<p>All laboratory assays were mapped to Logical Observation Identifiers Names and Codes (LOINC). We restricted our analysis to outpatient laboratory results to minimize the effects of extreme results from inpatient and emergency department data. We used all laboratory results dated between August 8, 1996 and March 3, 2016, excluding codes for which less than 0.5% of patients had a result. The resulting dataset consisted of 669,212 individuals and 143 laboratory assays.\n<\/p><p>We removed any laboratory results that were obtained prior to the patient\u2019s 18th birthday or after their 90th. 
In cases where a date of death was present, we also removed laboratory results that were obtained within one year of death, as we found that the frequency of observations often spiked during this period and the values for certain laboratory tests were altered for patients near death. For each patient, a median date of observation was calculated based on their remaining laboratory results. We defined a temporal window of observation by removing any laboratory results recorded more than five years from the median date. We then calculated the median result of the remaining laboratory tests for each patient. As each variable had a different scale and many deviated from normality, we applied Box-Cox and Z-transformations to all variables. The final dataset used for all downstream analyses contained 602,366 patients and 146 variables (age, sex, body mass index [BMI], and 143 laboratory measures).\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Variable_selection\">Variable selection<\/span><\/h3>\n<p>We first ranked the laboratory measures by total amount of missingness, lowest to highest. At each rank, we calculated the percentage of complete cases for the set, including all lower-ranked measures. We also built a random forest classifier to predict the presence or absence of each variable. Based on these results and domain knowledge, we selected 28 variables that provided a reasonable trade-off between quantity and completeness and that we deemed to be largely MAR.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Predicting_the_presence_of_data\">Predicting the presence of data<\/span><\/h3>\n<p>For each clinical laboratory measure, we used the scikit-learn<sup id=\"rdp-ebb-cite_ref-PedregosaScikit11_13-0\" class=\"reference\"><a href=\"#cite_note-PedregosaScikit11-13\" rel=\"external_link\">[13]<\/a><\/sup> random forest classifier to predict whether each value would be present. 
Each laboratory measure was converted to a binary label vector based on whether the measure was recorded. The values of all other laboratory measures, excluding comembers of a panel, were used as the training matrix input to the random forest. This process was repeated for each laboratory test using 10-fold cross-validation. We assessed prediction accuracy by the area under the receiver operating characteristic curve (AUROC) using the trapezoidal rule.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Sampling_of_complete_cases\">Sampling of complete cases<\/span><\/h3>\n<p>To generate a set of complete cases that resembled the whole population, we randomly sampled 100,000 patients without replacement. We then matched each of these individuals to the most similar patient who had a value for each of the 28 most common laboratory tests by matching sex and finding the minimal Euclidean distance in age and BMI.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Simulation_of_missing_data\">Simulation of missing data<\/span><\/h3>\n<p>Within the sampled complete cases, we selected the data for removal by four mechanisms:\n<\/p><p><b>Simulation 1<\/b>: Missing Completely at Random\nWe replaced values with NaN (indicator of missing data) at random. We repeated this procedure 10 times each for 10%, 20%, 30%, 40%, and 50% missingness, yielding 50 simulated datasets.\n<\/p><p><b>Simulation 2<\/b>: Missing at Random\nWe selected two columns (A and B) and a quartile. For rows in which the value in column A fell within the quartile, we randomly replaced 50% of the corresponding values in column B with NaN. We repeated the procedure for each quartile and each laboratory test combination, yielding 3024 simulated datasets.\n<\/p><p><b>Simulation 3<\/b>: Missing Not at Random\nWe selected a column and a quartile. When the column\u2019s value was in the quartile, we replaced it with NaN 50% of the time. 
We repeated this procedure for each of the four quartiles of each of the 28 laboratory values, generating a total of 112 simulated datasets.\n<\/p><p><b>Simulation 4<\/b>: Missingness Based on Real Data Observations\nFrom our complete-cases dataset, we matched each patient to the nearest neighbor, excluding self-matches, in the entire population based on their sex, age, and BMI. We then replaced any laboratory value in the complete cases with NaN if it was absent in the matched patient.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Imputation_of_missing_data\">Imputation of missing data<\/span><\/h3>\n<p>Using our simulated datasets (simulations 1-4), we compared 18 common imputation methods (12 representative methods are shown in the figures below) from the fancyimpute<sup id=\"rdp-ebb-cite_ref-GHFancyImpute_14-0\" class=\"reference\"><a href=\"#cite_note-GHFancyImpute-14\" rel=\"external_link\">[14]<\/a><\/sup> and the multivariate imputation by chained equations (MICE v2.30)<sup id=\"rdp-ebb-cite_ref-VanBuurenMice11_15-0\" class=\"reference\"><a href=\"#cite_note-VanBuurenMice11-15\" rel=\"external_link\">[15]<\/a><\/sup> packages. Multimedia Appendix 1 (table) shows a full list of imputation methods and the parameters used for each.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Results\">Results<\/span><\/h2>\n<p>Our first step was to select a subset of the 143 laboratory measures for which imputation would be a reasonable approach. We began by ranking the clinical laboratory measures in descending order by the number of patients who had an observed value for that test. For each ranked laboratory test, we plotted the percentage of individuals missing a value, as well as the percentage of complete cases when that given test was joined with all the tests with lower ranks (i.e., less missingness). These plots showed that the best trade-off between quantity of data and completeness was between 20 and 30 variables (Figure 2, part A). 
Beyond the 30 most common laboratory tests, the number of complete cases rapidly approached zero.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig2_Beaulieu-JonesJMIRMedInfo2018_6-1.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"16abf95f637fb2fcff60867699567fc3\"><img alt=\"Fig2 Beaulieu-JonesJMIRMedInfo2018 6-1.png\" src=\"https:\/\/www.limswiki.org\/images\/3\/34\/Fig2_Beaulieu-JonesJMIRMedInfo2018_6-1.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 2:<\/b> Summary of missing data across 143 clinical laboratory measures. <b>(A)<\/b> After ranking the clinical laboratory measures by the number of total results, the percentage of patients missing a result for each test was plotted (red points). At each rank, the percentage of complete cases for all tests of equal or lower rank were also plotted (blue points). Only variables with a rank \u226475 are shown. The vertical bar indicates the 28 tests that were selected for further analysis. <b>(B)<\/b> The full distribution of patient median ages is shown in blue, and the fraction of individuals in each age group that had a complete set of observations for tests 1-28 are shown in red. <b>(C)<\/b> Within the 28 laboratory tests that were selected for imputation analyses, the mean number of missing tests is depicted as a function of age. <b>(D)<\/b> Within the 28 laboratory tests that were selected for imputation, the mean number of missing tests is depicted as a function of body mass index (BMI). <b>(E)<\/b> Accuracy of a random forest predicting the presence or absence of all 143 laboratory tests. AUROC: area under the receiver operating characteristic curve. 
<b>(F)<\/b> Accuracy of a random forest predicting the presence or absence of the top 28 laboratory tests, by Logical Observation Identifiers Names and Codes (LOINC).<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>As age, sex, and BMI have a considerable impact on what clinical laboratory measures are collected, we evaluated the relationship between missingness and these covariates (Figure 2, parts B-D). We also used a random forest approach to predict the presence or absence of each measure based on the values of the other observed measures. MCAR data are not predictable, resulting in AUROCs near 0.5. Only 38 of the 143 laboratory tests had AUROCs less than 0.55 (Figure 2, part E). Very high AUROCs are most consistent with data that are MAR. For the top 30 candidate clinical laboratory measures based on the number of complete cases, the mean AUROC was 0.82. This suggested that the observed data could explain much of the mechanism responsible for the missing data within this set. We ultimately decided not to include the 29th-ranked laboratory test, specific gravity of urine (2965-2), since it had an AUROC of only 0.69 and is typically used for screening only within urology or nephrology departments (RV Levy, MD, personal conversation, June 2017). We included the lipid measures (ranks 25-28) since they had AUROC values near 0.82 and are recommended for screening of patients depending on age, sex, and BMI.<sup id=\"rdp-ebb-cite_ref-HelfandScreening08_16-0\" class=\"reference\"><a href=\"#cite_note-HelfandScreening08-16\" rel=\"external_link\">[16]<\/a><\/sup> Our data confirmed that age, sex, and BMI all predicted the presence of lipid measures (Multimedia Appendix 1, fig 1A-B).\n<\/p><p>To assess the accuracy of imputation methods, we required known values to compare with imputed values. 
Thus, we restricted our analysis to a subset of patients who were complete cases for the 28 selected variables (Table 1).<sup id=\"rdp-ebb-cite_ref-McDonaldLOINC03_17-0\" class=\"reference\"><a href=\"#cite_note-McDonaldLOINC03-17\" rel=\"external_link\">[17]<\/a><\/sup> Since the characteristics of this subset differed from those of the broader population (Figure 2, parts B-D), we used sampling and k-nearest neighbors (KNN) matching to generate a subset of the complete cases that better resembled the overall population. We then simulated missing data within this set by four mechanisms: MCAR, MAR, MNAR, and realistic patterns based on the original data.\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"2\"><b>Table 1.<\/b> Logical Observation Identifiers Names and Codes (LOINC) and descriptions of the most frequently ordered clinical laboratory measurements. 
The assays are ranked from the most common to the least.<br \/> <br \/><sup>a<\/sup>HDL: high-density lipoprotein; <sup>b<\/sup>LDL: low-density lipoprotein\n<\/td><\/tr>\n<tr>\n<th style=\"padding-left:10px; padding-right:10px;\">LOINC\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Description\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">718-7\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Hemoglobin [Mass\/volume] in Blood\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">4544-3\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Hematocrit [Volume Fraction] of Blood by Automated count\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">787-2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Erythrocyte mean corpuscular volume [Entitic volume] by Automated count\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">786-4\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Erythrocyte mean corpuscular hemoglobin concentration [Mass\/volume] by Automated count\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">785-6\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Erythrocyte mean corpuscular hemoglobin [Entitic mass] by Automated count\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">6690-2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Leukocytes [#\/volume] in Blood by Automated count\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">789-8\n<\/td>\n<td style=\"background-color:white; padding-left:10px; 
padding-right:10px;\">Erythrocytes [#\/volume] in Blood by Automated count\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">788-0\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Erythrocyte distribution width [Ratio] by Automated count\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">32623-1\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Platelet mean volume [Entitic volume] in Blood by Automated count\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">777-3\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Platelets [#\/volume] in Blood by Automated count\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2345-7\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Glucose [Mass\/volume] in Serum or Plasma\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2160-0\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Creatinine [Mass\/volume] in Serum or Plasma\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2823-3\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Potassium [Moles\/volume] in Serum or Plasma\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">3094-0\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Urea nitrogen [Mass\/volume] in Serum or Plasma\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2951-2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Sodium [Moles\/volume] in Serum or 
Plasma\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2075-0\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Chloride [Moles\/volume] in Serum or Plasma\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2028-9\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Carbon dioxide, total [Moles\/volume] in Serum or Plasma\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">17861-6\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Calcium [Mass\/volume] in Serum or Plasma\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">1743-4\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Alanine aminotransferase [Enzymatic activity\/volume] in Serum or Plasma by With P-5'-P\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">30239-8\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Aspartate aminotransferase [Enzymatic activity\/volume] in Serum or Plasma by With P-5'-P\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">1975-2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bilirubin.total [Mass\/volume] in Serum or Plasma\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2885-2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Protein [Mass\/volume] in Serum or Plasma\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">10466-1\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Anion gap 3 in Serum or 
Plasma\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">751-8\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Neutrophils [#\/volume] in Blood by Automated count\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2093-3\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Cholesterol [Mass\/volume] in Serum or Plasma\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2571-8\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Triglyceride [Mass\/volume] in Serum or Plasma\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">2085-9\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Cholesterol in HDL<sup>a<\/sup> [Mass\/volume] in Serum or Plasma\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">13457-7\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Cholesterol in LDL<sup>b<\/sup> [Mass\/volume] in Serum or Plasma by calculation\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>We next evaluated our ability to predict the presence of each value in the simulated datasets. These simulations confirmed that our MCAR simulation had a low AUROC (Figure 3, part A). Missingness in the MAR data (Figure 3, part B) and MNAR data (Figure 3, part C) was often well predicted, particularly in the MAR simulations and when data were missing from the tails of distributions. The AUROCs rarely exceeded 0.75 in the MNAR simulations, while values above 0.75 were typical in the MAR simulations. 
This provided additional support for our decision to restrict our focus to the top 28 laboratory measures, since they all had AUROCs between 0.75 and 0.9, which was outside the range of MNAR simulations (Figure 2, part F and Figure 3, part C).\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig3_Beaulieu-JonesJMIRMedInfo2018_6-1.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"47331e379ce1b735697f3a64920c13a1\"><img alt=\"Fig3 Beaulieu-JonesJMIRMedInfo2018 6-1.png\" src=\"https:\/\/www.limswiki.org\/images\/f\/f3\/Fig3_Beaulieu-JonesJMIRMedInfo2018_6-1.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 3:<\/b> Area under the receiver operating characteristic curve (AUROC) of a random forest predicting whether data will be present or missing. <b>(A)<\/b> Missing completely at random simulation. <b>(B)<\/b> Missing at random simulation. <b>(C)<\/b> Missing not at random simulation.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>We chose to test the accuracy of imputation for several methods from two popular and freely available libraries: the MICE package for R and the fancyimpute library for Python. We first applied each of these methods across simulations 1 to 3. For each combination, Figure 4 depicts the overall root mean square errors. 
Multimedia Appendix 1 (Supplemental Table and Figures 3-21) shows a breakdown of all the methods and parameters.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig4_Beaulieu-JonesJMIRMedInfo2018_6-1.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"78970db3666825343d8a950d4a9c12a7\"><img alt=\"Fig4 Beaulieu-JonesJMIRMedInfo2018 6-1.png\" src=\"https:\/\/www.limswiki.org\/images\/3\/32\/Fig4_Beaulieu-JonesJMIRMedInfo2018_6-1.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 4:<\/b> Imputation accuracy measured by root mean square error (RMSE) across simulations 1-3. <b>(A)<\/b> Missing completely at random (MCAR). <b>(B)<\/b> Missing at random (MAR). <b>(C)<\/b> Missing not at random (MNAR). FI: fancyimpute; KNN: k-nearest neighbors; MICE: Multivariate Imputation by Chained Equations; pmm: predictive mean matching; RF: random forest; SVD: singular value decomposition.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>We next measured imputation accuracy based on the patterns of missingness that we observed in the real data (Figure 5). The main difference compared with simulations 1 to 3 was lower error for some of the deterministic methods (mean, median, and KNN). It is worth mentioning that the error was highly dependent on the variable that was being imputed. Specifically, for the fancyimpute MICE predictive mean matching (pmm) method, multicollinearity within some of the variables caused convergence failures that led to extremely large errors (Figure 5, method MICE pmm [FI]). 
These factors were relatively easy to address in the R package MICE pmm method by adjusting the predictor matrix.<sup id=\"rdp-ebb-cite_ref-VanBuurenMice11_15-1\" class=\"reference\"><a href=\"#cite_note-VanBuurenMice11-15\" rel=\"external_link\">[15]<\/a><\/sup>\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig5_Beaulieu-JonesJMIRMedInfo2018_6-1.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"f52fa872f8fda1bea4a0f1bed406f161\"><img alt=\"Fig5 Beaulieu-JonesJMIRMedInfo2018 6-1.png\" src=\"https:\/\/www.limswiki.org\/images\/7\/78\/Fig5_Beaulieu-JonesJMIRMedInfo2018_6-1.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 5:<\/b> Imputation root mean square error (RMSE) for a subset of 10,000 patients from simulation 4. A total of 12 imputation methods were tested (x-axis), and each color corresponds to a Logical Observation Identifiers Names and Codes (LOINC) code. The black line shows the theoretical error from random sampling. FI: fancyimpute; KNN: k-nearest neighbors; MICE: Multivariate Imputation by Chained Equations; pmm: predictive mean matching; RF: random forest; SVD: singular value decomposition.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>In addition to evaluating the accuracy of imputation, it is also important to estimate the uncertainty associated with imputation. One approach to address this is multiple imputation, where each data point is imputed multiple times using a nondeterministic method. 
To determine whether each method properly captured the true uncertainty of the data, we compared the error between an imputed dataset and the observed data versus the error between two sets of imputed values for each method (Figure 6). If these errors are equal, then multiple imputation is likely producing good estimates of uncertainty. If, however, the error between two imputed datasets is less than that between each imputed dataset and the known values, then the imputation method is likely underestimating the variance.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig6_Beaulieu-JonesJMIRMedInfo2018_6-1.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"99dab32e59de9e4701bc5f85cab9dc75\"><img alt=\"Fig6 Beaulieu-JonesJMIRMedInfo2018 6-1.png\" src=\"https:\/\/www.limswiki.org\/images\/d\/df\/Fig6_Beaulieu-JonesJMIRMedInfo2018_6-1.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 6:<\/b> Assessment of multiple imputation for each method. Using simulation 4, missing values were imputed multiple times with each method. The x-axes show the root mean square error (RMSE) between the imputed data and the observed values. The y-axes show the RMSE between multiple imputations of the same data. The axis scales vary between panels to better show the range of variation. The laboratory tests are indicated by the color of the points. The black diagonal line represents unity (y=x). Panels are ordered by each method\u2019s mean deviation (MD) from unity, indicated in the top left corner of each panel. In the last 7 panels, the unity line is not visible because the variation between multiple imputations was close to zero. 
FI: fancyimpute; KNN: k-nearest neighbors; MICE: Multivariate Imputation by Chained Equations; pmm: predictive mean matching; RF: random forest; SVD: singular value decomposition.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>Our results (Figure 6) demonstrate that many of the imputation methods are not suitable for multiple imputation. Of the methods that had the lowest error in the MCAR, MAR, and MNAR simulations, we found three (softImpute, MICE col [fancyimpute], MICE norm.pred [R]) to have minimal variation between imputations. This was also true of KNN, singular value decomposition (SVD), mean, and median imputation. Only three methods (random sampling, MICE norm [R], and MICE pmm [R]) seemed to have similar error between the multiple imputations and the observed data and thus appear to be unbiased. The latter two had very similar performances and are the best candidates for multiple imputation. Two methods had intermediate performance. MICE random forest (R) was similar to several other MICE methods in terms of error relative to the observed data, but it produced slightly less variation between each imputed dataset. This seemed to affect some variables more than others but there was no obvious pattern. The MICE pmm (fancyimpute) was not deterministic but it did seem to achieve low error at the expense of increased bias. In this case, the variables that could be imputed with the lowest error also seemed to have the most bias. Since this method claims to be a reimplementation of the MICE pmm (R) method, this may be due to multicollinearity among the variables that could not easily be accounted for, as there was no simple way to alter the predictor matrix.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Discussion\">Discussion<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Principal_results\">Principal results<\/span><\/h3>\n<p>It is not possible, or even desirable, to choose \u201cthe best\u201d imputation method. 
There are many considerations that may not be generalizable between different sets of data; however, we can draw some general conclusions about how different methods compare in terms of error, bias, complexity, and difficulty of implementation. Based on our results, there seem to be three broad categories of methods.\n<\/p><p>The first category is the simple deterministic methods. These include mean or median imputation and KNN. While easy to implement, mean or median imputation may lead to severe bias and large errors if the unobserved data are more likely to come from the tails of the observed distribution (Figure 4, parts A-C, methods mean, median, and KNN). This will also cause the variance of the distribution to be underestimated if more than a small fraction of the data is missing. Since these methods are deterministic, they are also not suitable for multiple imputation (Figure 6, bottom row).\n<\/p><p>KNN is a popular choice for imputation that has been shown to perform very well for some types of data<sup id=\"rdp-ebb-cite_ref-BerettaNearest16_18-0\" class=\"reference\"><a href=\"#cite_note-BerettaNearest16-18\" rel=\"external_link\">[18]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-TroyanskayaMissing01_19-0\" class=\"reference\"><a href=\"#cite_note-TroyanskayaMissing01-19\" rel=\"external_link\">[19]<\/a><\/sup>, but it was not particularly well suited for our data, regardless of the choice of k. This may be due to issues of data dimensionality<sup id=\"rdp-ebb-cite_ref-PestovIsThe13_20-0\" class=\"reference\"><a href=\"#cite_note-PestovIsThe13-20\" rel=\"external_link\">[20]<\/a><\/sup> or to individuals not falling into well-separated groups based on their clinical laboratory results. 
This method is also not suitable for large datasets, since a distance matrix for all pairs of individuals is stored in memory during computation, and the size of the distance matrix scales with n<sup>2<\/sup>.\n<\/p><p>The second category of algorithms could be called the sophisticated deterministic methods. These include SVD, softImpute, MICE col, and MICE norm.pred. SVD performed poorly compared with its counterparts and sometimes produced errors greater than simple random sampling (Figure 5, method SVD). The reasons for this are not clear, but we cannot recommend this method. SoftImpute, MICE col, and MICE norm.pred were among the lowest-error methods in all of our simulations (Figure 5, methods MICE col and norm.pred). The main limitation of these methods is that they cannot be used for multiple imputation (Figure 6, middle row).\n<\/p><p>The third broad category of algorithms comprises the stochastic methods, which included random sampling and most of the remaining methods in the MICE library. Random sampling almost always produced the highest error (Figure 4 and Figure 5, method random sample), but it has the advantage of being easy to implement and it requires no parameter selection. The MICE methods based on pmm, random forests, and Bayesian linear regression tended to perform similarly in terms of error in most of our simulations (Figure 4 and Figure 5, methods MICE pmm, RF, and norm).\n<\/p><p>Imputation methods that involve stochasticity allow for a fundamentally different type of analysis called multiple imputation. 
In this paradigm, multiple imputed datasets (a minimum of three and often 10-20 depending on the percentage of missing data)<sup id=\"rdp-ebb-cite_ref-StuartMultiple_21-0\" class=\"reference\"><a href=\"#cite_note-StuartMultiple-21\" rel=\"external_link\">[21]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-WhiteMultiple11_22-0\" class=\"reference\"><a href=\"#cite_note-WhiteMultiple11-22\" rel=\"external_link\">[22]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-BodnerWhat08_23-0\" class=\"reference\"><a href=\"#cite_note-BodnerWhat08-23\" rel=\"external_link\">[23]<\/a><\/sup> are generated, and each is analyzed in the same way. At the end of all downstream analyses, the results are compared. Typically, the ultimate result of interest is supported by a <i>P<\/i> value, a regression coefficient, an odds ratio, etc. In the case of a multiply imputed dataset, the researcher will have several output statistics that can be used to estimate a confidence interval for the result.\n<\/p><p>Multiple imputation has been gaining traction recently, and the MICE package has become one of the most popular choices for implementing this procedure. This package is powerful and very well documented<sup id=\"rdp-ebb-cite_ref-VanBuurenMice11_15-2\" class=\"reference\"><a href=\"#cite_note-VanBuurenMice11-15\" rel=\"external_link\">[15]<\/a><\/sup> but, as with all imputation methods, it must be used with caution. In MICE, each variable is imputed one by one. This entire process is then repeated for a number of iterations such that the values imputed in one iteration can update the estimates for the next iteration. The result is a chain of imputed datasets, and this entire process is typically performed in parallel so that multiple chains are generated.\n<\/p><p>In MICE, several choices must be made. The first obvious choice is the imputation method (i.e., equation). 
Many methods are available in the base package, additional methods can be added from other packages<sup id=\"rdp-ebb-cite_ref-CRANMiceAdds_24-0\" class=\"reference\"><a href=\"#cite_note-CRANMiceAdds-24\" rel=\"external_link\">[24]<\/a><\/sup>, and users can even define their own. We thoroughly evaluated three methods in the context of our dataset: pmm, Bayesian linear regression (norm), and random forest.\n<\/p><p>The pmm is the default choice, and it can be used on a mixture of numeric and categorical variables. We found pmm to have a good trade-off between error and bias, but for our dataset it was critical to remove several variables from the predictor matrix due to strong correlations (<i>R<\/i>>.85) and multicollinearity. Bayesian regression performed similarly but was less sensitive to these issues. If a dataset contains only numeric values, Bayesian regression may be a safer option. Random forest tended to produce results that were slightly biased for a subset of the variables without an appreciable reduction in error. Aside from random sampling, none of the other methods we evaluated were suitable for multiple imputation (Figure 6).\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Conclusions\">Conclusions<\/span><\/h2>\n<p>Many factors must be considered when analyzing a dataset with missing values. This starts by determining whether each variable should be considered at all. Two good reasons to reject a variable are if it has too many missing values or if it is likely to be MNAR. 
If a variable is deemed to be MNAR, it may still be possible to impute it, but the mechanism of missingness should be explicitly modeled, and a sensitivity analysis is recommended to assess how much impact this could have on the final results.<sup id=\"rdp-ebb-cite_ref-H.C3.A9raud-BousquetPractical12_25-0\" class=\"reference\"><a href=\"#cite_note-H.C3.A9raud-BousquetPractical12-25\" rel=\"external_link\">[25]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-CarpenterSensit07_26-0\" class=\"reference\"><a href=\"#cite_note-CarpenterSensit07-26\" rel=\"external_link\">[26]<\/a><\/sup> While a statistical model of the mechanism of missingness is useful, there is no substitute for a deep familiarity with the data at hand and how they were generated.\n<\/p><p>Having selected the data, one must select an imputation method. Ideally, several methods should be tested in a realistic setting. Great care should be taken to construct a set of complete data that closely resemble all of the relevant characteristics of the data that one wishes to impute. Similar care should then be taken to remove some of these data in ways that closely resemble the observed patterns of missingness. If this is not feasible, one may also simulate a variety of datasets representing a range of possible data structures and missingness mechanisms. Any available imputation methods can then be applied to the simulated data, and the error between the imputed data and their known values provides a metric of performance.\n<\/p><p>While the minimization of error is an important goal, a singular focus on this objective is likely to lead to bias. For each missing value, it is also important to estimate the uncertainty associated with it. This can be achieved by multiple imputation using an algorithm that incorporates stochastic processes. Multiple imputation has become the field standard because it provides confidence intervals for the results of downstream analyses. 
One should not naively assume that any stochastic process is free of bias. It is important to check that multiple imputation is providing variability that corresponds to the actual uncertainty of the imputed values using a set of simulated data.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Acknowledgements\">Acknowledgements<\/span><\/h2>\n<p>We thank Dr. Casey S. Greene (University of Pennsylvania) for his helpful discussions. We also thank Dr. Rebecca V. Levy (Geisinger) for providing expert clinical domain knowledge.\n<\/p><p>This work was supported by the Commonwealth Universal Research Enhancement Program grant from the Pennsylvania Department of Health. BBJ and JM were also supported by US National Institutes of Health grants AI116794 and LM010098 to JM.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Authors.27_contributions\">Authors' contributions<\/span><\/h3>\n<p>BBJ, JM, SAP, and CRB conceived of the study. DRL and JWS performed data processing. BBJ and CRB performed analyses. BBJ, SAP, and CRB wrote the manuscript, and all authors revised and approved the final manuscript.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Conflicts_of_interest\">Conflicts of interest<\/span><\/h2>\n<p>None declared.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Abbreviations\">Abbreviations<\/span><\/h2>\n<p><b>AUROC<\/b>: area under the receiver operating characteristic curve\n<\/p><p><b>BMI<\/b>: body mass index\n<\/p><p><b>EHR<\/b>: electronic health record\n<\/p><p><b>KNN<\/b>: k-nearest neighbors\n<\/p><p><b>LOINC<\/b>: Logical Observation Identifiers Names and Codes\n<\/p><p><b>MAR<\/b>: missing at random\n<\/p><p><b>MCAR<\/b>: missing completely at random\n<\/p><p><b>MICE<\/b>: Multivariate Imputation by Chained Equations\n<\/p><p><b>MNAR<\/b>: missing not at random\n<\/p><p><b>pmm<\/b>: predictive mean matching\n<\/p><p><b>SVD<\/b>: singular value decomposition\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Additional_files\">Additional 
files<\/span><\/h2>\n<p>Supplemental table and figures: <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/medinform.jmir.org\/article\/downloadSuppFile\/8960\/59747\" target=\"_blank\">PDF File (Adobe PDF File), 4MB<\/a>\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-SteinbrookHealth09-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SteinbrookHealth09_1-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Steinbrook, R.. \"Health Care and the American Recovery and Reinvestment Act\". <i>New England Journal of Medicine<\/i> <b>360<\/b> (11): 1057\u20131060. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1056%2FNEJMp0900665\" target=\"_blank\">10.1056\/NEJMp0900665<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/19224738\" target=\"_blank\">19224738<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Health+Care+and+the+American+Recovery+and+Reinvestment+Act&rft.jtitle=New+England+Journal+of+Medicine&rft.aulast=Steinbrook%2C+R.&rft.au=Steinbrook%2C+R.&rft.volume=360&rft.issue=11&rft.pages=1057%E2%80%931060&rft_id=info:doi\/10.1056%2FNEJMp0900665&rft_id=info:pmid\/19224738&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-FlintoftDisease14-2\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-FlintoftDisease14_2-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Flintoft, L.. \"Disease genetics: Phenome-wide association studies go large\". <i>Nature Reviews Genetics<\/i> <b>15<\/b> (1): 2. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fnrg3637\" target=\"_blank\">10.1038\/nrg3637<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/24322724\" target=\"_blank\">24322724<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Disease+genetics%3A+Phenome-wide+association+studies+go+large&rft.jtitle=Nature+Reviews+Genetics&rft.aulast=Flintoft%2C+L.&rft.au=Flintoft%2C+L.&rft.volume=15&rft.issue=1&rft.pages=2&rft_id=info:doi\/10.1038%2Fnrg3637&rft_id=info:pmid\/24322724&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WellsStrat13-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WellsStrat13_3-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Wells, B.J.; Chagin, K.M.; Nowacki, A.S.; Kattan, M.W.. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4371484\" target=\"_blank\">\"Strategies for handling missing data in electronic health record derived data\"<\/a>. <i>EGEMS<\/i> <b>1<\/b> (3): 1035. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.13063%2F2327-9214.1035\" target=\"_blank\">10.13063\/2327-9214.1035<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4371484\/\" target=\"_blank\">PMC4371484<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25848578\" target=\"_blank\">25848578<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4371484\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4371484<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Strategies+for+handling+missing+data+in+electronic+health+record+derived+data&rft.jtitle=EGEMS&rft.aulast=Wells%2C+B.J.%3B+Chagin%2C+K.M.%3B+Nowacki%2C+A.S.%3B+Kattan%2C+M.W.&rft.au=Wells%2C+B.J.%3B+Chagin%2C+K.M.%3B+Nowacki%2C+A.S.%3B+Kattan%2C+M.W.&rft.volume=1&rft.issue=3&rft.pages=1035&rft_id=info:doi\/10.13063%2F2327-9214.1035&rft_id=info:pmc\/PMC4371484&rft_id=info:pmid\/25848578&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4371484&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BounthavongApproach15-4\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BounthavongApproach15_4-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Bounthavong, M.; Watanabe, J.H.; Sullivan, K.M.. 
\"Approach to addressing missing data for electronic medical records and pharmacy claims data research\". <i>Pharmacotherapy<\/i> <b>35<\/b> (4): 380\u20137. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1002%2Fphar.1569\" target=\"_blank\">10.1002\/phar.1569<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25884526\" target=\"_blank\">25884526<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Approach+to+addressing+missing+data+for+electronic+medical+records+and+pharmacy+claims+data+research&rft.jtitle=Pharmacotherapy&rft.aulast=Bounthavong%2C+M.%3B+Watanabe%2C+J.H.%3B+Sullivan%2C+K.M.&rft.au=Bounthavong%2C+M.%3B+Watanabe%2C+J.H.%3B+Sullivan%2C+K.M.&rft.volume=35&rft.issue=4&rft.pages=380%E2%80%937&rft_id=info:doi\/10.1002%2Fphar.1569&rft_id=info:pmid\/25884526&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BhaskaranWhat14-5\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-BhaskaranWhat14_5-0\" rel=\"external_link\">5.0<\/a><\/sup> <sup><a href=\"#cite_ref-BhaskaranWhat14_5-1\" rel=\"external_link\">5.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Bhaskaran, K.; Smeeth, L.. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4121561\" target=\"_blank\">\"What is the difference between missing completely at random and missing at random?\"<\/a>. <i>International Journal of Epidemiology<\/i> <b>43<\/b> (4): 1336-9. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fije%2Fdyu080\" target=\"_blank\">10.1093\/ije\/dyu080<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4121561\/\" target=\"_blank\">PMC4121561<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/24706730\" target=\"_blank\">24706730<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4121561\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4121561<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=What+is+the+difference+between+missing+completely+at+random+and+missing+at+random%3F&rft.jtitle=International+Journal+of+Epidemiology&rft.aulast=Bhaskaran%2C+K.%3B+Smeeth%2C+L.&rft.au=Bhaskaran%2C+K.%3B+Smeeth%2C+L.&rft.volume=43&rft.issue=4&rft.pages=1336-9&rft_id=info:doi\/10.1093%2Fije%2Fdyu080&rft_id=info:pmc\/PMC4121561&rft_id=info:pmid\/24706730&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4121561&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-RubinInference76-6\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-RubinInference76_6-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Rubin, D.B.. \"Inference and missing data\". <i>Biometrika<\/i> <b>63<\/b> (3): 581\u2013592. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fbiomet%2F63.3.581\" target=\"_blank\">10.1093\/biomet\/63.3.581<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Inference+and+missing+data&rft.jtitle=Biometrika&rft.aulast=Rubin%2C+D.B.&rft.au=Rubin%2C+D.B.&rft.volume=63&rft.issue=3&rft.pages=581%E2%80%93592&rft_id=info:doi\/10.1093%2Fbiomet%2F63.3.581&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-J.C3.B6rnstenAMeta07-7\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-J.C3.B6rnstenAMeta07_7-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">J\u00f6rnsten, R.; Ouyang, M.; Wang, H.Y.. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC1852325\" target=\"_blank\">\"A meta-data based method for DNA microarray imputation\"<\/a>. <i>BMC Bioinformatics<\/i> <b>8<\/b>: 109. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2F1471-2105-8-109\" target=\"_blank\">10.1186\/1471-2105-8-109<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC1852325\/\" target=\"_blank\">PMC1852325<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/17394658\" target=\"_blank\">17394658<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC1852325\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC1852325<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+meta-data+based+method+for+DNA+microarray+imputation&rft.jtitle=BMC+Bioinformatics&rft.aulast=J%C3%B6rnsten%2C+R.%3B+Ouyang%2C+M.%3B+Wang%2C+H.Y.&rft.au=J%C3%B6rnsten%2C+R.%3B+Ouyang%2C+M.%3B+Wang%2C+H.Y.&rft.volume=8&rft.pages=109&rft_id=info:doi\/10.1186%2F1471-2105-8-109&rft_id=info:pmc\/PMC1852325&rft_id=info:pmid\/17394658&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC1852325&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Beaulieu-JonesMissing17-8\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Beaulieu-JonesMissing17_8-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Beaulieu-Jones, B.K.; Moore, J.H.. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5144587\" target=\"_blank\">\"Missing data imputation in the electronic health record using deeply learned autoencoders\"<\/a>. <i>Pacific Symposium on Biocomputing<\/i> <b>22<\/b>: 207-218. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1142%2F9789813207813_0021\" target=\"_blank\">10.1142\/9789813207813_0021<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5144587\/\" target=\"_blank\">PMC5144587<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/27896976\" target=\"_blank\">27896976<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5144587\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5144587<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Missing+data+imputation+in+the+electronic+health+record+using+deeply+learned+autoencoders&rft.jtitle=Pacific+Symposium+on+Biocomputing&rft.aulast=Beaulieu-Jones%2C+B.K.%3B+Moore%2C+J.H.&rft.au=Beaulieu-Jones%2C+B.K.%3B+Moore%2C+J.H.&rft.volume=22&rft.pages=207-218&rft_id=info:doi\/10.1142%2F9789813207813_0021&rft_id=info:pmc\/PMC5144587&rft_id=info:pmid\/27896976&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5144587&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li 
id=\"cite_note-AllisonMissing02-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AllisonMissing02_9-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Allison, P.D. (2002). <i>Missing Data<\/i>. Quantitative Applications in the Social Sciences. <b>136<\/b>. SAGE Publications. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4135%2F9781412985079\" target=\"_blank\">10.4135\/9781412985079<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9781412985079.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=book&rft.btitle=Missing+Data&rft.aulast=Allison%2C+P.D.&rft.au=Allison%2C+P.D.&rft.date=2002&rft.series=Quantitative+Applications+in+the+Social+Sciences&rft.volume=136&rft.pub=SAGE+Publications&rft_id=info:doi\/10.4135%2F9781412985079&rft.isbn=9781412985079&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GHImputation-10\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GHImputation_10-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/github.com\/EpistasisLab\/imputation\" target=\"_blank\">\"EpistasisLab\/imputation\"<\/a>. GitHub, Inc<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/github.com\/EpistasisLab\/imputation\" target=\"_blank\">https:\/\/github.com\/EpistasisLab\/imputation<\/a><\/span><span class=\"reference-accessdate\">. Retrieved 01 December 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=EpistasisLab%2Fimputation&rft.atitle=&rft.pub=GitHub%2C+Inc&rft_id=https%3A%2F%2Fgithub.com%2FEpistasisLab%2Fimputation&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Beaulieu-JonesRepro17-11\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Beaulieu-JonesRepro17_11-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Beaulieu-Jones, B.K.; Greene, C.S.. \"Reproducibility of computational workflows is automated using continuous analysis\". <i>Nature Biotechnology<\/i> <b>35<\/b> (4): 342-346. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fnbt.3780\" target=\"_blank\">10.1038\/nbt.3780<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/28288103\" target=\"_blank\">28288103<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Reproducibility+of+computational+workflows+is+automated+using+continuous+analysis&rft.jtitle=Nature+Biotechnology&rft.aulast=Beaulieu-Jones%2C+B.K.%3B+Greene%2C+C.S.&rft.au=Beaulieu-Jones%2C+B.K.%3B+Greene%2C+C.S.&rft.volume=35&rft.issue=4&rft.pages=342-346&rft_id=info:doi\/10.1038%2Fnbt.3780&rft_id=info:pmid\/28288103&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DockerImputation-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-DockerImputation_12-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/hub.docker.com\/r\/brettbj\/ehr-imputation\/\" target=\"_blank\">\"brettbj\/ehr-imputation\"<\/a>. Docker, Inc<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/hub.docker.com\/r\/brettbj\/ehr-imputation\/\" target=\"_blank\">https:\/\/hub.docker.com\/r\/brettbj\/ehr-imputation\/<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 02 December 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=brettbj%2Fehr-imputation&rft.atitle=&rft.pub=Docker%2C+Inc&rft_id=https%3A%2F%2Fhub.docker.com%2Fr%2Fbrettbj%2Fehr-imputation%2F&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PedregosaScikit11-13\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PedregosaScikit11_13-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Pedregosa, F.; Varoquaux, G.; Gramfort, A. et al. (2011). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.jmlr.org\/papers\/v12\/pedregosa11a.html\" target=\"_blank\">\"Scikit-learn: Machine Learning in Python\"<\/a>. <i>Journal of Machine Learning Research<\/i> <b>12<\/b> (10): 2825\u20132830<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.jmlr.org\/papers\/v12\/pedregosa11a.html\" target=\"_blank\">http:\/\/www.jmlr.org\/papers\/v12\/pedregosa11a.html<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Scikit-learn%3A+Machine+Learning+in+Python&rft.jtitle=Journal+of+Machine+Learning+Research&rft.aulast=Pedregosa%2C+F.%3B+Varoquaux%2C+G.%3B+Gramfort%2C+A.+et+al.&rft.au=Pedregosa%2C+F.%3B+Varoquaux%2C+G.%3B+Gramfort%2C+A.+et+al.&rft.date=2011&rft.volume=12&rft.issue=10&rft.pages=2825%E2%80%932830&rft_id=http%3A%2F%2Fwww.jmlr.org%2Fpapers%2Fv12%2Fpedregosa11a.html&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GHFancyImpute-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GHFancyImpute_14-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/github.com\/iskandr\/fancyimpute\" target=\"_blank\">\"iskandr\/fancyimpute\"<\/a>. GitHub, Inc<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/github.com\/iskandr\/fancyimpute\" target=\"_blank\">https:\/\/github.com\/iskandr\/fancyimpute<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 15 February 2018<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=iskandr%2Ffancyimpute&rft.atitle=&rft.pub=GitHub%2C+Inc&rft_id=https%3A%2F%2Fgithub.com%2Fiskandr%2Ffancyimpute&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-VanBuurenMice11-15\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-VanBuurenMice11_15-0\" rel=\"external_link\">15.0<\/a><\/sup> <sup><a href=\"#cite_ref-VanBuurenMice11_15-1\" rel=\"external_link\">15.1<\/a><\/sup> <sup><a href=\"#cite_ref-VanBuurenMice11_15-2\" rel=\"external_link\">15.2<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">van Buuren, S.; Groothuis-Oudshoorn, K. (2011). \"mice: Multivariate Imputation by Chained Equations in R\". <i>Journal of Statistical Software<\/i> <b>45<\/b> (3). 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.18637%2Fjss.v045.i03\" target=\"_blank\">10.18637\/jss.v045.i03<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=mice%3A+Multivariate+Imputation+by+Chained+Equations+in+R&rft.jtitle=Journal+of+Statistical+Software&rft.aulast=van+Buuren%2C+S.%3B+Groothuis-Oudshoorn%2C+K.&rft.au=van+Buuren%2C+S.%3B+Groothuis-Oudshoorn%2C+K.&rft.date=2011&rft.volume=45&rft.issue=3&rft_id=info:doi\/10.18637%2Fjss.v045.i03&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HelfandScreening08-16\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HelfandScreening08_16-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Helfand, M.; Carson, S. (2008). \"Screening for Lipid Disorders in Adults: Selective Update of 2001 US Preventive Services Task Force Review\". <i>Evidence Syntheses<\/i> <b>49<\/b>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/20722146\" target=\"_blank\">20722146<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Screening+for+Lipid+Disorders+in+Adults%3A+Selective+Update+of+2001+US+Preventive+Services+Task+Force+Review&rft.jtitle=Evidence+Syntheses&rft.aulast=Helfand%2C+M.%3B+Carson%2C+S.&rft.au=Helfand%2C+M.%3B+Carson%2C+S.&rft.date=2008&rft.volume=49&rft_id=info:pmid\/20722146&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-McDonaldLOINC03-17\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-McDonaldLOINC03_17-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">McDonald, C.J.; Huff, S.M.; Suico, J.G. et al. (2003). \"LOINC, a universal standard for identifying laboratory observations: A 5-year update\". <i>Clinical Chemistry<\/i> <b>49<\/b> (4): 624\u201333. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/12651816\" target=\"_blank\">12651816<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=LOINC%2C+a+universal+standard+for+identifying+laboratory+observations%3A+A+5-year+update.&rft.jtitle=Clinical+Chemistry&rft.aulast=McDonald%2C+C.J.%3B+Huff%2C+S.M.%3B+Suico%2C+J.G.+et+al.&rft.au=McDonald%2C+C.J.%3B+Huff%2C+S.M.%3B+Suico%2C+J.G.+et+al.&rft.date=2003&rft.volume=49&rft.issue=4&rft.pages=624%E2%80%9333&rft_id=info:pmid\/12651816&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BerettaNearest16-18\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BerettaNearest16_18-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Beretta, L.; Santaniello, A. (2016). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4959387\" target=\"_blank\">\"Nearest neighbor imputation algorithms: A critical evaluation\"<\/a>. <i>BMC Medical Informatics and Decision Making<\/i> <b>16<\/b> (Suppl 3): 74. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2Fs12911-016-0318-z\" target=\"_blank\">10.1186\/s12911-016-0318-z<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4959387\/\" target=\"_blank\">PMC4959387<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/27454392\" target=\"_blank\">27454392<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4959387\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4959387<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Nearest+neighbor+imputation+algorithms%3A+A+critical+evaluation&rft.jtitle=BMC+Medical+Informatics+and+Decision+Making&rft.aulast=Beratta%2C+L.%3B+Santaniello%2C+A.&rft.au=Beratta%2C+L.%3B+Santaniello%2C+A.&rft.date=2016&rft.volume=16&rft.issue=Suppl+3&rft.pages=74&rft_id=info:doi\/10.1186%2Fs12911-016-0318-z&rft_id=info:pmc\/PMC4959387&rft_id=info:pmid\/27454392&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4959387&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-TroyanskayaMissing01-19\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-TroyanskayaMissing01_19-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Troyanskaya, O.; Cantor, M.; Sherlock, G. et al. (2001). 
\"Missing value estimation methods for DNA microarrays\". <i>Bioinformatics<\/i> <b>17<\/b> (6): 520\u20135. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/11395428\" target=\"_blank\">11395428<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Missing+value+estimation+methods+for+DNA+microarrays&rft.jtitle=Bioinformatics&rft.aulast=Troyanskaya%2C+O.%3B+Cantor%2C+M.%3B+Sherlock%2C+G.+et+al.&rft.au=Troyanskaya%2C+O.%3B+Cantor%2C+M.%3B+Sherlock%2C+G.+et+al.&rft.date=2001&rft.volume=17&rft.issue=6&rft.pages=520%E2%80%935&rft_id=info:pmid\/11395428&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PestovIsThe13-20\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PestovIsThe13_20-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Pestov, V. (2013). \"Is the k-NN classifier in high dimensions affected by the curse of dimensionality?\". <i>Computers & Mathematics with Applications<\/i> <b>65<\/b> (10): 1427\u201337. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.camwa.2012.09.011\" target=\"_blank\">10.1016\/j.camwa.2012.09.011<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Is+the+k-NN+classifier+in+high+dimensions+affected+by+the+curse+of+dimensionality%3F&rft.jtitle=Computers+%26+Mathematics+with+Applications&rft.aulast=Pestov%2C+V.&rft.au=Pestov%2C+V.&rft.date=2013&rft.volume=65&rft.issue=10&rft.pages=1427%E2%80%9337&rft_id=info:doi\/10.1016%2Fj.camwa.2012.09.011&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-StuartMultiple-21\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-StuartMultiple_21-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Stuart, E.A.; Azur, M.; Frangakis, C.; Leaf, P. (2009). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2727238\" target=\"_blank\">\"Multiple imputation with large data sets: A case study of the Children's Mental Health Initiative\"<\/a>. <i>American Journal of Epidemiology<\/i> <b>169<\/b> (9): 1133-9. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Faje%2Fkwp026\" target=\"_blank\">10.1093\/aje\/kwp026<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC2727238\/\" target=\"_blank\">PMC2727238<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/19318618\" target=\"_blank\">19318618<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2727238\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2727238<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Multiple+imputation+with+large+data+sets%3A+A+case+study+of+the+Children%27s+Mental+Health+Initiative&rft.jtitle=American+Journal+of+Epidemiology&rft.aulast=Stuart%2C+E.A.%3B+Azur%2C+M.%3B+Frangakis%2C+C.%3B+Leaf%2C+P.&rft.au=Stuart%2C+E.A.%3B+Azur%2C+M.%3B+Frangakis%2C+C.%3B+Leaf%2C+P.&rft.date=2009&rft.volume=69&rft.issue=9&rft.pages=1133-9&rft_id=info:doi\/10.1093%2Faje%2Fkwp026&rft_id=info:pmc\/PMC2727238&rft_id=info:pmid\/19318618&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC2727238&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WhiteMultiple11-22\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WhiteMultiple11_22-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">White, I.R.; Royston, P.; Wood, A.M. 
(2011). \"Multiple imputation using chained equations: Issues and guidance for practice\". <i>Statistics in Medicine<\/i> <b>30<\/b> (4): 377-99. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1002%2Fsim.4067\" target=\"_blank\">10.1002\/sim.4067<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/21225900\" target=\"_blank\">21225900<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Multiple+imputation+using+chained+equations%3A+Issues+and+guidance+for+practice&rft.jtitle=Statistics+in+Medicine&rft.aulast=White%2C+I.R.%3B+Royston%2C+P.%3B+Wood%2C+A.M.&rft.au=White%2C+I.R.%3B+Royston%2C+P.%3B+Wood%2C+A.M.&rft.date=2011&rft.volume=30&rft.issue=4&rft.pages=377-99&rft_id=info:doi\/10.1002%2Fsim.4067&rft_id=info:pmid\/21225900&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BodnerWhat08-23\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BodnerWhat08_23-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Bodner, T.E. (2008). \"What Improves with Increased Missing Data Imputations?\". <i>Structural Equation Modeling<\/i> <b>15<\/b> (4): 651\u201375. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1080%2F10705510802339072\" target=\"_blank\">10.1080\/10705510802339072<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=What+Improves+with+Increased+Missing+Data+Imputations%3F&rft.jtitle=Structural+Equation+Modeling&rft.aulast=Bodner%2C+T.E.&rft.au=Bodner%2C+T.E.&rft.date=2008&rft.volume=15&rft.issue=4&rft.pages=651%E2%80%9375&rft_id=info:doi\/10.1080%2F10705510802339072&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CRANMiceAdds-24\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CRANMiceAdds_24-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Robitzsch, A.; Grund, S.; Henke, T.. <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/cran.r-project.org\/web\/packages\/miceadds\/index.html\" target=\"_blank\">\"miceadds: Some Additional Multiple Imputation Functions, Especially for 'mice'\"<\/a>. <i>The Comprehensive R Archive Network<\/i><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/cran.r-project.org\/web\/packages\/miceadds\/index.html\" target=\"_blank\">https:\/\/cran.r-project.org\/web\/packages\/miceadds\/index.html<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 15 February 2018<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=miceadds%3A+Some+Additional+Multiple+Imputation+Functions%2C+Especially+for+%27mice%27&rft.atitle=The+Comprehensive+R+Archive+Network&rft.aulast=Robitzsch%2C+A.%3B+Grund%2C+S.%3B+Henke%2C+T.&rft.au=Robitzsch%2C+A.%3B+Grund%2C+S.%3B+Henke%2C+T.&rft_id=https%3A%2F%2Fcran.r-project.org%2Fweb%2Fpackages%2Fmiceadds%2Findex.html&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-H.C3.A9raud-BousquetPractical12-25\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-H.C3.A9raud-BousquetPractical12_25-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">H\u00e9raud-Bousquet, V.; Larsen, C.; Carpenter, J. et al. (2012). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3537570\" target=\"_blank\">\"Practical considerations for sensitivity analysis after multiple imputation applied to epidemiological studies with incomplete data\"<\/a>. <i>BMC Medical Research Methodology<\/i> <b>12<\/b>: 73. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2F1471-2288-12-73\" target=\"_blank\">10.1186\/1471-2288-12-73<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3537570\/\" target=\"_blank\">PMC3537570<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/22681630\" target=\"_blank\">22681630<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3537570\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3537570<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Practical+considerations+for+sensitivity+analysis+after+multiple+imputation+applied+to+epidemiological+studies+with+incomplete+data&rft.jtitle=BMC+Medical+Research+Methodology&rft.aulast=H%C3%A9raud-Bousquet%2C+V.%3B+Larsen%2C+C.%3B+Carpenter%2C+J.+et+al.&rft.au=H%C3%A9raud-Bousquet%2C+V.%3B+Larsen%2C+C.%3B+Carpenter%2C+J.+et+al.&rft.date=2012&rft.volume=12&rft.pages=73&rft_id=info:doi\/10.1186%2F1471-2288-12-73&rft_id=info:pmc\/PMC3537570&rft_id=info:pmid\/22681630&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3537570&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CarpenterSensit07-26\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CarpenterSensit07_26-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Carpenter, J.R.; Kenward, M.G.; White, I.R. (2007). \"Sensitivity analysis after multiple imputation under missing at random: a weighting approach\". <i>Statistical Methods in Medical Research<\/i> <b>16<\/b> (3): 259-75. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1177%2F0962280206075303\" target=\"_blank\">10.1177\/0962280206075303<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/17621471\" target=\"_blank\">17621471<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Sensitivity+analysis+after+multiple+imputation+under+missing+at+random%3A+a+weighting+approach&rft.jtitle=Statistical+Methods+in+Medical+Research&rft.aulast=Carpenter%2C+J.R.%3B+Kenward%2C+M.G.%3B+White%2C+I.R.&rft.au=Carpenter%2C+J.R.%3B+Kenward%2C+M.G.%3B+White%2C+I.R.&rft.date=2007&rft.volume=16&rft.issue=3&rft.pages=259-75&rft_id=info:doi\/10.1177%2F0962280206075303&rft_id=info:pmid\/17621471&rfr_id=info:sid\/en.wikipedia.org:Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to grammar, spelling, and presentation, including the addition of PMCID and DOI when they were missing from the original reference.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis\">https:\/\/www.limswiki.org\/index.php\/Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div 
class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","8159b0ee46c6326792ce28d0e7506e33_images":["https:\/\/www.limswiki.org\/images\/6\/68\/Fig1_Beaulieu-JonesJMIRMedInfo2018_6-1.png","https:\/\/www.limswiki.org\/images\/3\/34\/Fig2_Beaulieu-JonesJMIRMedInfo2018_6-1.png","https:\/\/www.limswiki.org\/images\/f\/f3\/Fig3_Beaulieu-JonesJMIRMedInfo2018_6-1.png","https:\/\/www.limswiki.org\/images\/3\/32\/Fig4_Beaulieu-JonesJMIRMedInfo2018_6-1.png","https:\/\/www.limswiki.org\/images\/7\/78\/Fig5_Beaulieu-JonesJMIRMedInfo2018_6-1.png","https:\/\/www.limswiki.org\/images\/d\/df\/Fig6_Beaulieu-JonesJMIRMedInfo2018_6-1.png"],"8159b0ee46c6326792ce28d0e7506e33_timestamp":1544813854,"15471c0a609cecac0db384f57371da08_type":"article","15471c0a609cecac0db384f57371da08_title":"Developing a customized approach for strengthening tuberculosis laboratory quality management systems toward accreditation (Albert et al. 2017)","15471c0a609cecac0db384f57371da08_url":"https:\/\/www.limswiki.org\/index.php\/Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation","15471c0a609cecac0db384f57371da08_plaintext":"\n\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\n\t\t\t\tJournal:Developing a customized approach for strengthening tuberculosis laboratory quality management systems toward accreditation\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tFrom LIMSWiki\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\tJump to: navigation, search\n\n\t\t\t\t\t\n\t\t\t\t\tFull article title\n \nDeveloping a customized approach for strengthening tuberculosis laboratory quality management systems toward accreditationJournal\n \nAfrican Journal of Laboratory MedicineAuthor(s)\n \nAlbert, Heidi; Trollip, Andre; Erni, Donatelle; Kao, KekeletsoAuthor affiliation(s)\n \nFoundation for Innovative New DiagnosticsPrimary contact\n \nEmail: heidi dot albert at finddx dot orgYear published\n \n2017Volume and issue\n \n6 (2)Page(s)\n \na576DOI\n 
\n10.4102\/ajlm.v6i2.576ISSN\n \n2225-2010Distribution license\n \nCreative Commons Attribution 2.0 GenericWebsite\n \nhttp:\/\/www.ajlmonline.org\/index.php\/ajlm\/article\/view\/576\/826Download\n \nhttp:\/\/www.ajlmonline.org\/index.php\/ajlm\/article\/viewFile\/576\/816 (PDF)\n\nContents\n\n1 Abstract \n2 Introduction \n3 TB SLMTA development \n\n3.1 Customization of training materials \n3.2 TB SLMTA Harmonized Checklist \n3.3 Implementation of TB SLMTA \n3.4 Training-of-trainers workshop \n3.5 Models for implementation \n3.6 Improvement projects and mentoring \n\n\n4 Results from TB SLMTA implementation workshops \n5 Discussion \n\n5.1 Limitations \n5.2 Conclusions \n\n\n6 Acknowledgements \n\n6.1 Competing interests \n6.2 Sources of support \n6.3 Authors\u2019 contributions \n\n\n7 References \n8 Notes \n\n\n\nAbstract \nBackground: Quality-assured tuberculosis laboratory services are critical to achieve global and national goals for tuberculosis prevention and care. Implementation of a quality management system (QMS) in laboratories leads to improved quality of diagnostic tests and better patient care. The Strengthening Laboratory Management Toward Accreditation (SLMTA) program has led to measurable improvements in the QMS of clinical laboratories. However, progress in tuberculosis laboratories has been slower, which may be attributed to the need for a structured tuberculosis-specific approach to implementing QMS. We describe the development and early implementation of the Strengthening Tuberculosis Laboratory Management Toward Accreditation (TB SLMTA) program.\nDevelopment: The TB SLMTA curriculum was developed by customizing the SLMTA curriculum to include specific tools, job aids, and supplementary materials specific to the tuberculosis laboratory. 
The TB SLMTA Harmonized Checklist was developed from the World Health Organization Regional Office for Africa Stepwise Laboratory Quality Improvement Process Towards Accreditation checklist and incorporated tuberculosis-specific requirements from the Global Laboratory Initiative Stepwise Process Towards Tuberculosis Laboratory Accreditation online tool.\nImplementation: Four regional training-of-trainers workshops have been conducted since 2013. The TB SLMTA program has been rolled out in 37 tuberculosis laboratories in 10 countries, using the workshop approach in 32 laboratories in five countries and the facility-based approach in five tuberculosis laboratories in five countries.\nConclusion: Lessons learned from early implementation of TB SLMTA suggest that a structured training and mentoring program can build a foundation towards further quality improvement in tuberculosis laboratories. Structured mentoring and institutionalization of QMS into country programs are needed to support tuberculosis laboratories in achieving accreditation.\n\nIntroduction \nThe World Health Organization\u2019s (WHO) End TB Strategy calls for an end to the global tuberculosis epidemic. It aims to reduce deaths by 95 percent and new tuberculosis cases by 90 percent, and also ensure that no family is burdened with catastrophic expenses due to tuberculosis by 2025.[1] Despite the fall in global tuberculosis mortality by 47 percent since 1990, the disease still claimed more than 1.5 million lives in 2014.[2] A cascade of events \u2014 including poor screening, failure to link screened patients to diagnostic services, and failure to link diagnosed patients to treatment \u2014 means that many people die from tuberculosis due to delayed diagnosis and treatment initiation.[3]\nQuality-assured laboratory services are critical for the provision of timely, accurate, and reliable results to support diagnosis, drug-resistance testing, treatment monitoring, and surveillance of disease. 
Weak laboratory systems result in high levels of laboratory error that impact patient care and undermine the confidence healthcare providers have in laboratory services.[4] In recent years, the focus on improving laboratory quality management systems (QMS), and assuring the quality of laboratory services by working toward national or international laboratory accreditation, has intensified.[5] Accreditation is the formal recognition of implementation of a QMS that adheres to international standards and has been shown to improve the quality of healthcare for patients through reduction in testing errors.[6]\nThe Strengthening Laboratory Management Toward Accreditation (SLMTA) program was developed by the United States Centers for Disease Control and Prevention in collaboration with the American Society for Clinical Pathology, the Clinton Health Access Initiative, and the WHO Regional Office for Africa to promote immediate and measurable quality improvement in laboratories in developing countries. SLMTA is a program that may be used to prepare laboratories for accreditation.[7] Since its launch in Kigali, Rwanda in 2009, SLMTA has been implemented in 47 countries (23 in Africa), with 617 laboratories already enrolled. Eighteen per cent of the enrolled laboratories are at the national level and most (98%) are providing HIV-related services.[8] Only four National Tuberculosis Reference Laboratories (NTRLs) in Africa have achieved international accreditation to date[9][10], and only six NTRLs have undergone a formal Stepwise Laboratory Quality Improvement Process Towards Accreditation (SLIPTA) audit by the African Society for Laboratory Medicine (T. Mekonen, personal communication). 
Accredited NTRLs are better equipped to support the national tuberculosis laboratory network and also provide reliable support to their national tuberculosis control and treatment programs.[11]\nSince 2007, the Foundation for Innovative New Diagnostics (FIND) has worked with Ministries of Health to introduce new diagnostic technologies that improve the diagnosis of tuberculosis and the detection of drug resistance[12] and to upgrade facilities.[13][14][15][16] Although technical capacity to conduct new tests can be developed within a relatively short time frame, persistent challenges to providing quality results in a consistent manner often remain, many of which are linked to laboratory quality system weaknesses. In 2011, through funding from the United States President\u2019s Emergency Plan for AIDS Relief, FIND was involved in implementation of the SLMTA program in clinical laboratories in the Dominican Republic. Measurable improvement was observed in cohorts of laboratories participating in the program. However, tuberculosis laboratories were not included in this program. Concurrently, the Global Laboratory Initiative (GLI) was developing its Stepwise Process Towards Tuberculosis Laboratory Accreditation online tool.[17] This tool provided online resources and a framework consisting of four phases, but it did not have training materials or an implementation plan to enable adoption by tuberculosis laboratories. Tuberculosis laboratories, particularly at the central or regional level, have separate facilities from other clinical laboratories. They have different requirements for biosafety and quality assurance, and they have often been excluded from accreditation efforts. 
Recognizing the unique needs of tuberculosis laboratories, FIND developed a comprehensive approach to tuberculosis laboratory strengthening based on the existing SLMTA approach and incorporating aspects of the GLI Stepwise Process Towards Tuberculosis Laboratory Accreditation online tool.\nIn this article, we describe the development of the Tuberculosis Strengthening Laboratory Management Toward Accreditation (TB SLMTA) program and the challenges experienced during early implementation in 10 countries. We also reflect on approaches that will ensure continued quality improvement to reach accreditation and institutionalization of the program.\n\nTB SLMTA development \nCustomization of training materials \nIn 2012, FIND conducted a review of the SLMTA materials and customized the content for tuberculosis laboratories based on available tuberculosis resources (either developed internally by FIND or by other organizations). This customization included the development of specific tools, job aids, and supplementary materials for the implementation of a QMS in the tuberculosis laboratory (Table 1), but it kept the overall structure of the SLMTA curriculum. Customization included major changes to the content of the SLMTA Facilities and Safety and Quality Assurance modules (the focus was changed from the quantitative testing in SLMTA to the qualitative and semi-quantitative testing relevant to the tuberculosis laboratory). The SLMTA Laboratory Testing and Test Result Reporting modules were combined and an Auditing module was introduced. Tuberculosis laboratory-specific tools, examples and scenarios were introduced throughout all modules in the training. The TB SLMTA Harmonized Checklist was also introduced as part of the program.\nThe TB SLMTA curriculum was piloted in Cape Town in April 2013 in a shortened Training-of-Trainers (TOT) Workshop led by SLMTA Master Trainers and with experienced tuberculosis laboratory specialists as participants. 
Following the pilot workshop, some changes were made to the training materials (e.g., organization and cross-referencing of tools, adjustment of training notes for clarity, and correction of editing errors) and the TB SLMTA Harmonized Checklist was revised.\nSubsequent review and revision of the TB SLMTA curriculum have been conducted to keep the content current with an updated GLI tool (version 2.0, 2013) and the WHO Regional Office for Africa SLIPTA (2015) tool. A review of the TB SLMTA curriculum was conducted in 2015 after experience showed that improvement projects did not necessarily target the highest-priority non-conformities. Based on feedback from previous training, minor changes were also made to the Cross-cutting, Facilities and Safety, and Quality Assurance modules.\n\r\n\n\n\n\n\n\n\n\n\n\n Table 1: Comparison of SLMTA and TB SLMTA program components\n\n\n\nTB SLMTA Harmonized Checklist \nThe TB SLMTA Harmonized Checklist[18] is based on the WHO Regional Office for Africa SLIPTA checklist (2007)[19], and incorporates tuberculosis laboratory-specific requirements as provided in the GLI Stepwise Process Towards Tuberculosis Laboratory Accreditation tool, which were inserted as sub-clauses in the SLIPTA checklist. The TB SLMTA Harmonized Checklist is used to assess the QMS of the tuberculosis laboratory prior to enrollment in the program (baseline assessment) and after program completion (exit assessment). The differences between the scores obtained overall, and for each section, are a measure of the impact of the program. Assessors evaluate the laboratory operations as per checklist items, scoring the assessment and documenting their findings in detail.\nThe pilot version of the TB SLMTA Harmonized Checklist[20] had additional scores allocated to the tuberculosis-specific clauses. A revised checklist (TB SLMTA Harmonized Checklist v1.0), which maintained the original SLIPTA scoring system[21], was used in the TB SLMTA roll-out. 
Recognition is given using a five-star grading system, with the following scores corresponding to the indicated number of stars: zero stars (0\u2013142 points; < 55%), one star (143\u2013165 points; 55%\u201364%), two stars (166\u2013191 points; 65%\u201374%), three stars (192\u2013217 points; 75%\u201384%), four stars (218\u2013243 points; 85%\u201394%) and five stars (244\u2013258 points; \u2265 95%).\nThe TB SLMTA Harmonized Checklist v1.0 was recently revised in keeping with SLIPTA v2:2015, and the additional clauses of International Organization for Standardization 15189:2012. The questions added pertain to risk assessment, laboratory information systems, contingency planning, and safety. The TB SLMTA Harmonized Checklist v1.0 is available in English and Spanish. The TB SLMTA Harmonized Checklist v2.1 is available in English and Russian.[16]\n\nImplementation of TB SLMTA \nImplementation of the TB SLMTA program starts with the initial engagement with the Ministry of Health on the program scope and expected outputs, as well as commitments required from the country (Figure 1). During this planning phase, the country selects the participating tuberculosis laboratories, the model of implementation, the trainees to attend the TOT, and the TB SLMTA participants who will attend the in-country training. Each country selects two or three participants per laboratory to attend the in-country TB SLMTA training. Typically, participants include the laboratory manager, quality officer, and one technician. After graduation from the TOT, the certified trainers implement the program in the country. Baseline and exit assessments are conducted with the TB SLMTA Harmonized Checklist v1.0 by trainers or SLIPTA-trained assessors with tuberculosis laboratory experience. In-country national or regional training is conducted over a period of 12\u201315 months. Between training sessions, participants work on improvement projects supervised by the TB SLMTA mentors. 
Post-TB SLMTA activities are conducted in the laboratories under supervision of the mentors before an external assessment determines the readiness for accreditation.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 1: Diagrammatic representation of the TB SLMTA program from initiation to accreditation\n\n\n\nTraining-of-trainers workshop \nThe TB SLMTA TOTs are conducted by SLMTA Master Trainers and are based on teach-back methodology.[22] This practice-based training approach requires trainees to play the roles of both trainer and participant as they teach the curriculum at the same time as they are learning the content. The TOTs provide trainees with an introduction to the TB SLMTA materials, practice in delivering the content, and feedback on their performance. The ratio of trainees to Master Trainers is a maximum of eight to one. To be certified as trainers, trainees must demonstrate knowledge of the TB SLMTA curriculum and proficiency in delivering training. Trainees who find teach-back challenging and do not show a good understanding of the materials graduate as one-on-one coaches. They can facilitate rollout in their laboratory but are not certified to train others.\nMentors are trainers who support the in-country training participants during the implementation phases between training sessions. During mentoring visits to the laboratory, they supervise the participants as they implement the improvement projects and provide resources (e.g., standard operating procedures) to apply in the tuberculosis laboratory what was taught in the training. The fundamentals of mentoring are modeled during the TOT. Trainees who are certified as trainers and who show an aptitude for mentoring are selected by the Master Trainers to perform mentoring in their countries. Mentoring in TB SLMTA builds on the relationship established between trainer and participant, and seeks to support program implementation in the laboratory. 
Master Trainers support the certified trainers and mentors during their first national or regional training and, where possible, provide at least one interim visit to support mentoring. Trainers under supervision receive additional support from the Master Trainers during the workshop and, if assessed as proficient, can then graduate as trainers.\nThe TOTs are intensive and highly interactive, so good language skills and a working knowledge of QMS concepts are required. Based on this observation and challenges experienced in conducting a TOT with participants with varying levels of English fluency, a mandatory online training was introduced prior to the TOT, based on the WHO's Laboratory Quality Management System: Handbook[23], to ensure that trainees have a basic understanding of QMS principles. In addition, trainees whose first language is not English are required to successfully complete an online language competency training before registration for the TOT.\n\nModels for implementation \nTwo models have been adopted for implementation of TB SLMTA:\n\nWorkshop approach: Where several tuberculosis laboratories are available in-country, or in cases where more than one country conducts centralized training, the three-workshop approach can be used. Three five-day regional workshops are conducted by trainers, approximately three months apart.\n Facility-based approach: Where there is only one tuberculosis laboratory in the country being enrolled in TB SLMTA, the facility-based approach may be used. The facility-based approach follows the same TB SLMTA curriculum, with training sessions split into three blocks over 12 months.\nFactors affecting choice of implementation model include funding, number of laboratories participating in TB SLMTA, and availability of staff. TB SLMTA is targeted for implementation in tuberculosis laboratories at the national or referral level. 
These laboratories are conducting advanced tuberculosis testing and generally have separate facilities from general laboratories. Laboratories conducting tuberculosis testing on lower levels of the healthcare system are not targeted with this training session. However, this does not preclude the use of TB SLMTA resources to guide them, especially those related to safety and quality assurance.\n\nImprovement projects and mentoring \nImprovement projects are broad-based activities that address weaknesses in the QMS. Topics for improvement projects are chosen from subjects covered in training. As with SLMTA (Table 1), each participant is required to complete two improvement projects between training sessions. The \"just do it\" project (e.g., maintaining personnel files) is implemented as a group by all the participants from the laboratory. The \"complex\" project, which requires extensive planning and before-and-after data collection, is chosen with assistance from the certified trainers. Ideally, laboratory management is included in the decision of the topic and scope (if laboratory managers are not participants) to ensure management engagement and allocation of time and resources to complete the projects. The projects are implemented by the participants but should involve the entire laboratory staff. Participants present their findings at national or regional workshops or on a day set aside by the laboratory (facility-based approach).\nFIND found that often the choice of improvement projects does not reflect the priority gaps of the laboratory. In 2015, FIND adopted a more stringent criterion for improvement project selection. Under the guidance of certified trainers, each participant completes two improvement projects between training sessions; both are \"complex\" and require extensive planning and data collection. The first project is based on the subjects covered in the training. 
For example, the subjects include those of Training 1 (Quality indicators and Facilities and safety), Training 2 (Equipment, Purchasing and inventory, and Quality assurance) and Training 3 (Documents and records, Client Management and Customer Service, and Specimen management) (Table 2). The second project addresses the weaknesses identified during the baseline assessment. These non-conformities are split between the participants, and a different section of the TB SLMTA Harmonized Checklist is covered between training sessions.\nTB SLMTA uses a short-term mentoring model instead of the embedded model encouraged by SLMTA. Mentoring visits are conducted by the trainers over two or three days. Each facility receives two visits between workshops. The outcomes of the mentoring visits and, in particular, the progress with improvement projects are monitored by the mentors for each laboratory, and any necessary support is provided. Standardized data collection tools are used to record the findings of mentor visits.\n\r\n\n\n\n\n\n\n\n\n\n\n Table 2: Examples and types of improvement projects implemented in the TB SLMTA program\n\n\n\nResults from TB SLMTA implementation workshops \nSince 2013, four regional TOTs have been conducted in Lesotho, Vietnam, South Africa, and Moldova. Seventy trainees from 27 countries have been trained, and 59 are certified as trainers (including trainers under supervision), of whom four are from WHO Supranational Reference Laboratories that provide tuberculosis laboratory technical support to countries. Twenty-six trainers are currently active in the TB SLMTA program. Currently there are three Master Trainers. One Master Trainer, based in the African region, graduated after conducting a round of TB SLMTA, and we expect two more graduates in the coming year (one in the African region and one in South East Asia) for a total of six Master Trainers.\nThe TB SLMTA program has been rolled out in 37 tuberculosis laboratories in 10 countries (Figure 2). 
National or regional TB SLMTA training using the workshop approach was conducted in 32 laboratories in five countries (Dominican Republic, Ethiopia, Lesotho, Tanzania, and Vietnam). The facility-based approach has been used in one regional tuberculosis laboratory in Cameroon. The instructional phase is complete in these laboratories but is ongoing in the four NTRLs in Eastern Europe (Armenia, Azerbaijan, Belarus, and Moldova).\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 2: Implementation of TB SLMTA in 37 tuberculosis laboratories in 10 countries since 2013\n\n\n\nBaseline and exit assessment scores for 18 laboratories in four countries (Cameroon, Ethiopia, Lesotho, and Tanzania) were available for analysis and are summarized in Table 3. At baseline, six of the 18 laboratories had a zero-star rating, three had a one-star rating, seven had a two-star rating and two laboratories had a four-star rating. No laboratories had three- or five-star ratings at baseline assessment. At exit, two laboratories remained at zero stars, two were rated at one star, four were rated at two stars, seven were rated at three stars and three were rated at four stars. The impact of TB SLMTA, as well as the individual country experiences, will be addressed in separate publications.\n\r\n\n\n\n\n\n\n\n\n\n\n Table 3: \"Stars\" at baseline and exit for 18 tuberculosis laboratories in four countries completing the TB SLMTA program (2013\u20132016); results based on TB SLMTA Harmonized Checklist baseline and exit scores\n\n\n\nFIND developed an online biosafety training program in 2014[24], and TB SLMTA participants in Tanzania and Lesotho were enrolled in this training to complement the basic biosafety module of the TB SLMTA program. This task-based online training was implemented in conjunction with biosafety improvement projects following Workshop 1.\nActive participation over the extended duration of the in-country training is a challenge for trainers and participants alike. 
In our cohort, 21 participants (Lesotho, one; Dominican Republic, eight; Ethiopia, seven; Tanzania, three; Vietnam, two) were unable to complete the compulsory training and improvement projects due to personal or job-related reasons. Although in most cases additional participants from the same laboratory meant that the laboratory was not excluded from continuing the program, one regional tuberculosis laboratory in Tanzania was not able to complete the program as both participants were unable to finish the training.\n\nDiscussion \nTuberculosis laboratories are an essential element of tuberculosis prevention and care, providing testing for diagnosis, surveillance, and treatment monitoring that can be accessible at all levels of the healthcare system. The TB SLMTA program provides tuberculosis laboratories with customized support to accelerate the process of strengthening their QMS towards accreditation. There is an urgent need to expand the program, as only 21 NTRLs (43%) on the African continent have received SLMTA training, and only four NTRLs have reached accreditation. Although 44% of NTRLs report implementing a QMS, the extent of implementation is not known.[25]\nThere were a number of challenges to implementing the TB SLMTA program in the initial cohort of laboratories. The lack of experienced assessors was a challenge in some countries. SLIPTA-trained assessors with experience in tuberculosis testing were used to supplement certified TB SLMTA trainers. However, limited hands-on time with the TB SLMTA Harmonized Checklist during the TB SLMTA TOT, together with the unfamiliarity of SLIPTA-trained assessors with the tuberculosis laboratory-specific clauses, may lead to inflated scoring during these assessments. 
While laboratories enrolled in the TB SLMTA program may use the WHO Regional Office for Africa SLIPTA checklist, the additional components from GLI included in the TB SLMTA Harmonized Checklist v1.0 enable technical assessment alongside assessment of International Organization for Standardization (ISO) components.\nIn instances where management had not been fully engaged in the TB SLMTA implementation, participants struggled to complete the improvement projects. It is therefore critical to actively engage upper management, both at the facility level and at the national Ministry of Health, to ensure their commitment to the program. Institutionalization of the QMS into country programs will be needed to support tuberculosis laboratories in achieving accreditation. Training and quality improvement activities may be seen as extra workload, especially in settings where staff shortages and high workload are existing challenges. Furthermore, trainers and mentors, who are critical components of the program, are required to support the program in addition to their usual duties. This may put additional strain on the laboratory, as other staff are required to cover their workstations during their absence.\nIn addition to senior-level engagement of the Ministry of Health, QMS activities being conducted by various implementing partners and donors should be coordinated centrally to ensure synergy and to avoid duplication of effort, confusion, and wasted resources. We found multiple partners conducting overlapping activities related to QMS without clear coordination to ensure cost-efficiency and maximum impact from available resources. Partners should seek active collaboration on QMS activities, harmonization of approaches, and contributions of various groups, under the leadership and coordination of the Ministry of Health.\nThe TOTs are highly interactive, and some trainees whose first language is not English find the training challenging. 
The introduction in 2014 of a language proficiency requirement and an online introduction-to-QMS training helped ensure that trainees in the TOTs were successfully certified as trainers. However, this approach limits the pool of potential trainees. In 2016, FIND conducted a TOT in English, with real-time Russian translation (using a tuberculosis laboratory specialist as translator). All the trainees passed, suggesting that the model can be expanded to non-English speaking countries using translated materials (including the TB SLMTA Harmonized Checklist) and real-time translation. Careful consideration must be given to the choice of translator, with preference given to those who have insight into laboratory testing or QMS. Further analysis of this approach is required. Master Trainers are certified after successful supervision of the roll-out of the TB SLMTA program in a country. To facilitate the expansion of the TB SLMTA program, there is a need for more Master Trainers, particularly those who can train in languages other than English.\nAs noted earlier, FIND recently adopted a more stringent criterion for improvement project selection. A focus on the weaknesses identified in baseline assessment, in particular quality indicator and quality control monitoring and safety in the tuberculosis laboratory, has the potential to improve the impact of the TB SLMTA program. As the cohort of tuberculosis laboratories that have used this strategy increases, the impact will be measured.\nMentoring of laboratories was found to be an important component of successful implementation of SLMTA. Embedded mentoring has proven to result in measurable improvement in the QMS in many countries, including Lesotho, Zimbabwe, Kenya, and Nigeria.[26][27][28][29] In TB SLMTA, certified trainers mentor participants during site visits and remotely between workshops. This short-term mentoring model is cost-effective, scalable, and sustainable, and it is well suited to the workshop approach of implementation used in our cohort. 
Ongoing structured mentoring of the tuberculosis laboratories that obtained four-star ratings at TB SLMTA exit assessment is being conducted in preparation for accreditation. The TB SLMTA program is currently focused on tuberculosis laboratories with the capacity to perform advanced diagnostics such as culture and drug susceptibility testing. Tuberculosis laboratories at lower levels of the healthcare system may consider integration into current SLMTA activities. In addition, if feasible, countries should consider sharing mentoring and assessments between programs. These cost-saving approaches have the added benefit of integrating services; they also present opportunities for knowledge sharing and will encourage sustainability and institutionalization of the QMS.\n\nLimitations \nThis study is subject to a number of limitations. First, none of the TB SLMTA laboratories have reached accreditation yet, and we are thus reporting on intermediate measures of quality improvement leading to the ultimate target of accreditation. Second, quality improvement from three stars to five stars (which is considered equivalent to accreditation readiness) is challenging.[30] Third, the role of mentors in this final phase is still to be determined. Finally, in this article we have not addressed the costs of TB SLMTA. A cost estimation exercise is being undertaken. We do not expect the costs to differ substantially from costs of the SLMTA program as reported by others.[31]\n\nConclusions \nTB SLMTA is a structured training and mentoring program that is customized to meet the needs of tuberculosis laboratories implementing a QMS in resource-limited settings within a reasonably short time frame, building a foundation for further quality improvement toward accreditation. 
Expansion of this program is an urgent priority to address the need for accreditation of tuberculosis laboratories on the African continent and beyond.\n\nAcknowledgements \nThe findings and conclusions in this publication are those of the authors and do not necessarily represent the official position of the CDC.\n\nCompeting interests \nThe authors declare that they have no financial or personal relationship(s) that may have inappropriately influenced them in writing this article.\n\nSources of support \nWe are grateful to the United States President\u2019s Emergency Plan for AIDS Relief through the United States Centers for Disease Control and Prevention (3U2GPS002746), ExpandTB, UNITAID, UK Aid, Aus Aid, and the WHO for funding support.\n\nAuthors\u2019 contributions \nH.A., A.T. and K.K. contributed to development and implementation of the program, data analysis, and preparation and critical review of the manuscript. D.E. contributed to the data analysis. All authors agreed with the content of the manuscript.\n\nReferences \n\n\n\u2191 World Health Organization (2015). \"WHO End TB Strategy: Global strategy and targets for tuberculosis prevention, care and control after 2015\". http:\/\/who.int\/tb\/post2015_strategy\/en\/ . Retrieved 22 August 2016 .   \n\n\u2191 World Health Organization (2015). \"Global Tuberculosis Report 2015\" (PDF). ISBN 9789241565059. http:\/\/apps.who.int\/iris\/bitstream\/10665\/191102\/1\/9789241565059_eng.pdf . Retrieved 22 August 2016 .   \n\n\u2191 Kuznetsov, V.N.; Grjibovski, A.M.; Mariandyshev, A.O. et al. (2014). \"Two vicious circles contributing to a diagnostic delay for tuberculosis patients in Arkhangelsk\". Emerging Health Threats Journal 7: 24909. doi:10.3402\/ehtj.v7.24909. PMC PMC4147085. PMID 25163673. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4147085 .   \n\n\u2191 Alemnji, G.A.; Zeh, C.; Yao, K.; Fonjungo, P.N. (2014). 
\"Strengthening national health laboratories in sub-Saharan Africa: a decade of remarkable progress\". Tropical Medicine & International Health 19 (4): 450-8. doi:10.1111\/tmi.12269. PMC PMC4826025. PMID 24506521. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4826025 .   \n\n\u2191 Gershy-Damet, G.M.; Rotz, P.; Cross, D. et al. (2010). \"The World Health Organization African region laboratory accreditation process: Improving the quality of laboratory systems in the African region\". American Journal of Clinical Pathology 134 (3): 393-400. doi:10.1309\/AJCPTUUC2V1WJQBM. PMID 20716795.   \n\n\u2191 Peter, T.F.; Rotz, P.D.; Blair, D.H. et al. (2010). \"Impact of laboratory accreditation on patient care and the health system\". American Journal of Clinical Pathology 134 (4): 550-5. doi:10.1309\/AJCPH1SKQ1HNWGHF. PMID 20855635.   \n\n\u2191 Yao, K.; McKinney, B.; Murphy, A. et al. (2010). \"Improving quality management systems of laboratories in developing countries: An innovative training approach to accelerate laboratory accreditation\". American Journal of Clinical Pathology 134 (3): 401\u20139. doi:10.1309\/AJCPNBBL53FWUIQJ. PMID 20716796.   \n\n\u2191 Yao, K.; Luman, E.T.; SLMTA Collaborating Authors (2014). \"Evidence from 617 laboratories in 47 countries for SLMTA-driven improvement in quality management systems\". African Journal of Laboratory Medicine 3 (3): 262. doi:10.4102\/ajlm.v3i2.262. PMC PMC4706175. PMID 26753132. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4706175 .   \n\n\u2191 \"Directory of Accredited Facilities\". South African National Accreditation System. 2015. http:\/\/home.sanas.co.za\/?page_id=38 . Retrieved 22 August 2016 .   \n\n\u2191 \"Direct\u00f3rio de Entidades Acreditradas\". Instituto Portugu\u00eas de Acredita\u00e7\u00e3o. http:\/\/www.ipac.pt\/pesquisa\/acredita.asp . Retrieved 19 January 2016 .   \n\n\u2191 Ridderhof, J.C.; van Deun, A.; Kam, K.M. et al. (2007). 
\"Roles of laboratories and laboratory systems in effective tuberculosis programmes\". Bulletin of the World Health Organization 85 (5): 354-9. PMC PMC2636656. PMID 17639219. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2636656 .   \n\n\u2191 Raizada, N.; Sachdeva, K.S.; Chauhan, D.S. et al. (2014). \"A multi-site validation in India of the line probe assay for the rapid diagnosis of multi-drug resistant tuberculosis directly from sputum specimens\". PLoS One 9 (2): e88626. doi:10.1371\/journal.pone.0088626. PMC PMC3929364. PMID 24586360. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3929364 .   \n\n\u2191 Raizada, N.; Sachdeva, K.S.; Sreenivas, A. et al. (2014). \"Feasibility of decentralised deployment of Xpert MTB\/RIF test at lower level of health system in India\". PLoS One 9 (2): e89301. doi:10.1371\/journal.pone.0089301. PMC PMC3935858. PMID 24586675. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3935858 .   \n\n\u2191 Albert, H.; Bwanga, F.; Mukkada, S. et al. (2010). \"Rapid screening of MDR-TB using molecular Line Probe Assay is feasible in Uganda\". BMC Infectious Diseases 10: 41. doi:10.1186\/1471-2334-10-41. PMC PMC2841659. PMID 20187922. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2841659 .   \n\n\u2191 Albert, H.; Manabe, Y.; Lukyamuzi, G. et al. (2010). \"Performance of three LED-based fluorescence microscopy systems for detection of tuberculosis in Uganda\". PLoS One 5 (12): e15206. doi:10.1371\/journal.pone.0015206. PMC PMC3011008. PMID 21203398. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3011008 .   \n\n\u2191 16.0 16.1 Paramasivan, C.N.; Lee, E.; Kao, K. et al. (2010). \"Experience establishing tuberculosis laboratory capacity in a developing country setting\". International Journal of Tuberculosis and Lung Disease 13 (1): 59-64. PMID 20003696.   
\n\n\u2191 \"GLI Stepwise Process towards TB Laboratory Accreditation\". Global Laboratory Initiative. http:\/\/www.gliquality.org\/ . Retrieved 22 August 2016 .   \n\n\u2191 \"TB Laboratory Quality Management Systems Towards Accreditation Harmonized Checklist\" (PDF). FIND. February 2016. https:\/\/www.finddx.org\/wp-content\/uploads\/2016\/07\/NEW-TB-Harmonized-Checklist-v2.1-2-2016.pdf . Retrieved 19 January 2017 .   \n\n\u2191 \"WHO AFRO SLIPTA Checklist\". African Society for Laboratory Medicine. 2007. http:\/\/www.afro.who.int\/en\/downloads\/cat_view\/1501-english\/787-blood-safety.html . Retrieved 19 January 2017 .   \n\n\u2191 Maruta, T.; Albert, H.; Hove, P. et al. (2012). \"Harmonizing quality improvement of TB laboratories with generic accreditation initiatives\". ASLM Conference Proceedings. http:\/\/citeweb.info\/20121448643 .   \n\n\u2191 Albert, H. (2014). \"Strengthening laboratory management toward accreditation programme: Transforming the lab landscape in developing countries and customisation for labs\". Proceedings of the 45th Union World Conference on Lung Health. http:\/\/html5.slideonline.eu\/event\/14UNION .   \n\n\u2191 Maruta, T.; Yao, K.; Ndlovu, N. et al. (2014). \"Training-of-trainers: A strategy to build country capacity for SLMTA expansion and sustainability\". African Journal of Laboratory Medicine 3 (2): 196. doi:10.4102\/ajlm.v3i2.196. PMC PMC4703333. PMID 26753131. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4703333 .   \n\n\u2191 World Health Organization (2011). \"Laboratory Quality Management System: Handbook\". ISBN 9789241548274. http:\/\/www.who.int\/ihr\/publications\/lqms\/en\/ . Retrieved 22 August 2016 .   \n\n\u2191 Foundation for Innovative New Diagnostics. \"FIND Online Training\". http:\/\/finddiagnostics-training.org\/moodle\/ . Retrieved 22 August 2016 .   \n\n\u2191 Albert, H.; de Dieu Iragena, J.; Kao, K. et al. (2017). 
\"Implementation of quality management systems and progress towards accreditation of National Tuberculosis Reference Laboratories in Africa\". African Journal of Laboratory Medicine 6 (2): 490. doi:10.4102\/ajlm.v6i2.490. PMC PMC5523922. PMID 28879161. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5523922 .   \n\n\u2191 Maruta, T.; Motenbang, D.; Mathabo, L. et al. (2012). \"Impact of mentorship on WHO-AFRO Strengthening Laboratory Quality Improvement Process Towards Accreditation (SLIPTA)\". African Journal of Laboratory Medicine 1 (1): 6. doi:10.4102\/ajlm.v1i1.6. PMC PMC5644515. PMID 29062726. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5644515 .   \n\n\u2191 Nzombe, P.; Luman, E.T.; Shumba, E. et al. (2014). \"Maximising mentorship: Variations in laboratory mentorship models implemented in Zimbabwe\". African Journal of Laboratory Medicine 3 (2): 241. doi:10.4102\/ajlm.v3i2.241. PMC PMC5637805. PMID 29043196. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5637805 .   \n\n\u2191 Makokha, E.P.; Mwalili, S.; Basiye, F.L. et al. (2014). \"Using standard and institutional mentorship models to implement SLMTA in Kenya\". African Journal of Laboratory Medicine 3 (2): 220. doi:10.4102\/ajlm.v3i2.220. PMC PMC5637804. PMID 29043191. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5637804 .   \n\n\u2191 Maruta, T.; Rotz, P.; Peter, T. et al. (2013). \"Setting up a structured laboratory mentoring programme\". African Journal of Laboratory Medicine 2 (1): 77. doi:10.4102\/ajlm.v2i1.77. PMC PMC5637775. PMID 29043168. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5637775 .   \n\n\u2191 Ndihokubwayo, J.B.; Maruta, T.; Ndlovu, N. et al. (2016). \"Implementation of the World Health Organization Regional Office for Africa Stepwise Laboratory Quality Improvement Process Towards Accreditation\". 
African Journal of Laboratory Medicine 5 (1): 280. doi:10.4102\/ajlm.v5i1.280. PMC PMC5436392. PMID 28879103. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5436392 .   \n\n\u2191 Shumba, E.; Nzombe, P.; Mbinda, A. et al. (2014). \"Weighing the costs: Implementing the SLMTA programme in Zimbabwe using internal versus external facilitators\". African Journal of Laboratory Medicine 3 (2): 248. doi:10.4102\/ajlm.v3i2.248. PMC PMC5637799. PMID 29043197. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5637799 .   \n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to grammar, spelling, and presentation, including the addition of PMCID and DOI when they were missing from the original reference. Reference 19 is a dead URL, and an archived version could not be located. Reference 21 can't be found at the author-supplied URL.\n\n\n\n\n\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\">https:\/\/www.limswiki.org\/index.php\/Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation<\/a>\n\t\t\t\t\tCategories: LIMSwiki journal articles (added in 2018)LIMSwiki journal articles (all)LIMSwiki journal articles on quality management\n\nThis page was last modified on 23 January 2018, at 20:26.\nContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n\n","15471c0a609cecac0db384f57371da08_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div 
id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:Developing a customized approach for strengthening tuberculosis laboratory quality management systems toward accreditation<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p><b>Background<\/b>: Quality-assured tuberculosis <a href=\"https:\/\/www.limswiki.org\/index.php\/Laboratory\" title=\"Laboratory\" target=\"_blank\" class=\"wiki-link\" data-key=\"c57fc5aac9e4abf31dccae81df664c33\">laboratory<\/a> services are critical to achieve global and national goals for tuberculosis prevention and care. Implementation of a <a href=\"https:\/\/www.limswiki.org\/index.php\/Quality_management_system\" title=\"Quality management system\" target=\"_blank\" class=\"wiki-link\" data-key=\"dfecf3cd6f18d4a5e9ac49ca360b447d\">quality management system<\/a> (QMS) in laboratories leads to improved quality of diagnostic tests and better patient care. The Strengthening Laboratory Management Toward Accreditation (SLMTA) program has led to measurable improvements in the QMS of <a href=\"https:\/\/www.limswiki.org\/index.php\/Clinical_laboratory\" title=\"Clinical laboratory\" target=\"_blank\" class=\"wiki-link\" data-key=\"307bcdf1bdbcd1bb167cee435b7a5463\">clinical laboratories<\/a>. However, progress in tuberculosis laboratories has been slower, which may be attributed to the need for a structured tuberculosis-specific approach to implementing QMS. 
We describe the development and early implementation of the Strengthening Tuberculosis Laboratory Management Toward Accreditation (TB SLMTA) program.\n<\/p><p><b>Development<\/b>: The TB SLMTA curriculum was developed by customizing the SLMTA curriculum to include tools, job aids, and supplementary materials specific to the tuberculosis laboratory. The TB SLMTA Harmonized Checklist was developed from the World Health Organization Regional Office for Africa Stepwise Laboratory Quality Improvement Process Towards Accreditation checklist and incorporated tuberculosis-specific requirements from the Global Laboratory Initiative Stepwise Process Towards Tuberculosis Laboratory Accreditation online tool.\n<\/p><p><b>Implementation<\/b>: Four regional training-of-trainers workshops have been conducted since 2013. The TB SLMTA program has been rolled out in 37 tuberculosis laboratories in 10 countries, using the workshop approach in 32 laboratories in five countries and the facility-based approach in five tuberculosis laboratories in five countries.\n<\/p><p><b>Conclusion<\/b>: Lessons learned from early implementation of TB SLMTA suggest that a structured training and mentoring program can build a foundation towards further quality improvement in tuberculosis laboratories. Structured mentoring and institutionalization of QMS into country programs are needed to support tuberculosis laboratories in achieving accreditation.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<p>The World Health Organization\u2019s (WHO) End TB Strategy calls for an end to the global tuberculosis epidemic. 
It aims to reduce deaths by 95 percent and new tuberculosis cases by 90 percent, and also ensure that no family is burdened with catastrophic expenses due to tuberculosis by 2035.<sup id=\"rdp-ebb-cite_ref-WHOGlobal15_1-0\" class=\"reference\"><a href=\"#cite_note-WHOGlobal15-1\" rel=\"external_link\">[1]<\/a><\/sup> Despite the fall in global tuberculosis mortality by 47 percent since 1990, the disease still claimed more than 1.5 million lives in 2014.<sup id=\"rdp-ebb-cite_ref-WHOGlobalTub15_2-0\" class=\"reference\"><a href=\"#cite_note-WHOGlobalTub15-2\" rel=\"external_link\">[2]<\/a><\/sup> A cascade of events \u2014 including poor screening, failure to link screened patients to diagnostic services, and failure to link diagnosed patients to treatment \u2014 means that many people die from tuberculosis due to delayed diagnosis and treatment initiation.<sup id=\"rdp-ebb-cite_ref-KuznetsovTwo14_3-0\" class=\"reference\"><a href=\"#cite_note-KuznetsovTwo14-3\" rel=\"external_link\">[3]<\/a><\/sup>\n<\/p><p>Quality-assured laboratory services are critical for the provision of timely, accurate, and reliable results to support diagnosis, drug-resistance testing, treatment monitoring, and surveillance of disease. 
Weak laboratory systems result in high levels of laboratory error that impact patient care and undermine the confidence healthcare providers have in laboratory services.<sup id=\"rdp-ebb-cite_ref-AlemnjiStrengthen14_4-0\" class=\"reference\"><a href=\"#cite_note-AlemnjiStrengthen14-4\" rel=\"external_link\">[4]<\/a><\/sup> In recent years, the focus on improving laboratory quality management systems (QMS), and assuring the quality of laboratory services by working toward national or international laboratory accreditation, has intensified.<sup id=\"rdp-ebb-cite_ref-Gershy-DametTheWorld10_5-0\" class=\"reference\"><a href=\"#cite_note-Gershy-DametTheWorld10-5\" rel=\"external_link\">[5]<\/a><\/sup> Accreditation is the formal recognition of implementation of a QMS that adheres to international standards and has been shown to improve the quality of healthcare for patients through reduction in testing errors.<sup id=\"rdp-ebb-cite_ref-PeterImpact10_6-0\" class=\"reference\"><a href=\"#cite_note-PeterImpact10-6\" rel=\"external_link\">[6]<\/a><\/sup>\n<\/p><p>The Strengthening Laboratory Management Toward Accreditation (SLMTA) program was developed by the United States <a href=\"https:\/\/www.limswiki.org\/index.php\/Centers_for_Disease_Control_and_Prevention\" title=\"Centers for Disease Control and Prevention\" target=\"_blank\" class=\"wiki-link\" data-key=\"176aa9c9513251c328d864d1e724e814\">Centers for Disease Control and Prevention<\/a> in collaboration with the <a href=\"https:\/\/www.limswiki.org\/index.php\/American_Society_for_Clinical_Pathology\" title=\"American Society for Clinical Pathology\" target=\"_blank\" class=\"wiki-link\" data-key=\"ed64f8785e5a87d9739326d346bd4c13\">American Society for Clinical Pathology<\/a>, the Clinton Health Access Initiative, and the WHO Regional Office for Africa to promote immediate and measurable quality improvement in laboratories in developing countries. 
SLMTA is a program that may be used to prepare laboratories for accreditation.<sup id=\"rdp-ebb-cite_ref-YaoImproving10_7-0\" class=\"reference\"><a href=\"#cite_note-YaoImproving10-7\" rel=\"external_link\">[7]<\/a><\/sup> Since its launch in Kigali, Rwanda in 2009, SLMTA has been implemented in 47 countries (23 in Africa), with 617 laboratories already enrolled. Eighteen per cent of the enrolled laboratories are at the national level and most (98%) are providing HIV-related services.<sup id=\"rdp-ebb-cite_ref-YaoEvidence14_8-0\" class=\"reference\"><a href=\"#cite_note-YaoEvidence14-8\" rel=\"external_link\">[8]<\/a><\/sup> Only four National Tuberculosis Reference Laboratories (NTRLs) in Africa have achieved international accreditation to date<sup id=\"rdp-ebb-cite_ref-SANASDirectory15_9-0\" class=\"reference\"><a href=\"#cite_note-SANASDirectory15-9\" rel=\"external_link\">[9]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-IPACDirectorio_10-0\" class=\"reference\"><a href=\"#cite_note-IPACDirectorio-10\" rel=\"external_link\">[10]<\/a><\/sup>, and only six NTRLs have undergone a formal Stepwise Laboratory Quality Improvement Process Towards Accreditation (SLIPTA) audit by the African Society for Laboratory Medicine (T. Mekonen, personal communication). 
Accredited NTRLs are better equipped to support the national tuberculosis laboratory network and also provide reliable support to their national tuberculosis control and treatment programs.<sup id=\"rdp-ebb-cite_ref-RidderhofRoles07_11-0\" class=\"reference\"><a href=\"#cite_note-RidderhofRoles07-11\" rel=\"external_link\">[11]<\/a><\/sup>\n<\/p><p>Since 2007, the Foundation for Innovative New Diagnostics (FIND) has worked with Ministries of Health to introduce new diagnostic technologies to improve the diagnosis of tuberculosis, detection of drug resistance<sup id=\"rdp-ebb-cite_ref-RaizadaAMulti14_12-0\" class=\"reference\"><a href=\"#cite_note-RaizadaAMulti14-12\" rel=\"external_link\">[12]<\/a><\/sup> and upgrading of facilities.<sup id=\"rdp-ebb-cite_ref-RaizadaFeasib14_13-0\" class=\"reference\"><a href=\"#cite_note-RaizadaFeasib14-13\" rel=\"external_link\">[13]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-AlbertRapid10_14-0\" class=\"reference\"><a href=\"#cite_note-AlbertRapid10-14\" rel=\"external_link\">[14]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-AlbertPerform10_15-0\" class=\"reference\"><a href=\"#cite_note-AlbertPerform10-15\" rel=\"external_link\">[15]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-ParamasivanExperience10_16-0\" class=\"reference\"><a href=\"#cite_note-ParamasivanExperience10-16\" rel=\"external_link\">[16]<\/a><\/sup> Although technical capacity to conduct new tests can be developed within a relatively short time frame, persistent challenges to providing quality results in a consistent manner often remain, many of which are linked to laboratory quality system weaknesses. In 2011, through funding from the United States President\u2019s Emergency Plan for AIDS Relief, FIND was involved in implementation of the SLMTA program in clinical laboratories in the Dominican Republic. Measurable improvement was observed in cohorts of laboratories participating in the program. However, tuberculosis laboratories were not included in this program. 
Concurrently, the Global Laboratory Initiative (GLI) was developing its Stepwise Process Towards Tuberculosis Laboratory Accreditation online tool.<sup id=\"rdp-ebb-cite_ref-GLIStepwise_17-0\" class=\"reference\"><a href=\"#cite_note-GLIStepwise-17\" rel=\"external_link\">[17]<\/a><\/sup> This tool provided online resources and a framework consisting of four phases, but it did not have training materials or an implementation plan to enable adoption by tuberculosis laboratories. Tuberculosis laboratories, particularly at the central or regional level, have separate facilities from other clinical laboratories. They have different requirements for biosafety and quality assurance, and they have often been excluded from accreditation efforts. Recognizing the unique needs of tuberculosis laboratories, FIND developed a comprehensive approach to tuberculosis laboratory strengthening based on the existing SLMTA approach and incorporating aspects of the GLI Stepwise Process Towards Tuberculosis Laboratory Accreditation online tool.\n<\/p><p>In this article, we describe the development of the Tuberculosis Strengthening Laboratory Management Toward Accreditation (TB SLMTA) program and the challenges experienced during early implementation in 10 countries. We also reflect on approaches that will ensure continued quality improvement to reach accreditation and institutionalization of the program.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"TB_SLMTA_development\">TB SLMTA development<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Customization_of_training_materials\">Customization of training materials<\/span><\/h3>\n<p>In 2012, FIND conducted a review of the SLMTA materials and customized the content for tuberculosis laboratories based on available tuberculosis resources (developed either internally by FIND or by other organizations). 
This customization included the development of specific tools, job aids, and supplementary materials for the implementation of a QMS in the tuberculosis laboratory (Table 1), but it kept the overall structure of the SLMTA curriculum. Customization included major changes to the content of the SLMTA <i>Facilities and Safety<\/i> and <i>Quality Assurance<\/i> modules (the focus was changed from the quantitative testing in SLMTA to the qualitative and semi-quantitative testing relevant to the tuberculosis laboratory). The SLMTA <i>Laboratory Testing<\/i> and <i>Test Result Reporting<\/i> modules were combined, and an <i>Auditing<\/i> module was introduced. Tuberculosis laboratory-specific tools, examples and scenarios were introduced throughout all modules in the training. The TB SLMTA Harmonized Checklist was also introduced as part of the program.\n<\/p><p>The TB SLMTA curriculum was piloted in Cape Town in April 2013 in a shortened Training-of-Trainers (TOT) Workshop led by SLMTA Master Trainers and with experienced tuberculosis laboratory specialists as participants. Following the pilot workshop, some changes were made to the training materials (e.g., organization and cross-referencing of tools, adjustment of training notes for clarity, and correction of editing errors) and the TB SLMTA Harmonized Checklist was revised.\n<\/p><p>Subsequent review and revision of the TB SLMTA curriculum has been conducted to keep the content current with an updated GLI tool (version 2.0, 2013) and the WHO Regional Office for Africa SLIPTA (2015) tool. A review of the TB SLMTA curriculum was conducted in 2015 because experience showed that improvement projects did not necessarily target the highest-priority non-conformities. 
Based on feedback from previous training, minor changes were also made to the <i>Cross-cutting<\/i>, <i>Facilities and Safety<\/i>, and <i>Quality Assurance<\/i> modules.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Tab1_Albert_AfricanJofLabMed2017_6-2.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"885787be9aee7b3a8d361abedd3c0d19\"><img alt=\"Tab1 Albert AfricanJofLabMed2017 6-2.jpg\" src=\"https:\/\/www.limswiki.org\/images\/8\/8a\/Tab1_Albert_AfricanJofLabMed2017_6-2.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Table 1:<\/b> Comparison of SLMTA and TB SLMTA program components<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"TB_SLMTA_Harmonized_Checklist\">TB SLMTA Harmonized Checklist<\/span><\/h3>\n<p>The TB SLMTA Harmonized Checklist<sup id=\"rdp-ebb-cite_ref-FIND_TBLab16_18-0\" class=\"reference\"><a href=\"#cite_note-FIND_TBLab16-18\" rel=\"external_link\">[18]<\/a><\/sup> is based on the WHO Regional Office for Africa SLIPTA checklist (2007)<sup id=\"rdp-ebb-cite_ref-WHOAFRO-SLIPTA_19-0\" class=\"reference\"><a href=\"#cite_note-WHOAFRO-SLIPTA-19\" rel=\"external_link\">[19]<\/a><\/sup>, and incorporates tuberculosis laboratory-specific requirements as provided in the GLI Stepwise Process Towards Tuberculosis Laboratory Accreditation tool, which were inserted as sub-clauses in the SLIPTA checklist. The TB SLMTA Harmonized Checklist is used to assess the QMS of the tuberculosis laboratory prior to enrollment in the program (baseline assessment) and after program completion (exit assessment). 
The differences between the baseline and exit scores, overall and for each section, are a measure of the impact of the program. Assessors evaluate the laboratory operations as per checklist items, scoring the assessment and documenting their findings in detail.\n<\/p><p>The pilot version of the TB SLMTA Harmonized Checklist<sup id=\"rdp-ebb-cite_ref-MarutaHarmonizing12_20-0\" class=\"reference\"><a href=\"#cite_note-MarutaHarmonizing12-20\" rel=\"external_link\">[20]<\/a><\/sup> had additional scores allocated to the tuberculosis-specific clauses. A revised checklist (TB SLMTA Harmonized Checklist v1.0), which maintained the original SLIPTA scoring system<sup id=\"rdp-ebb-cite_ref-AlbertStrengthen14_21-0\" class=\"reference\"><a href=\"#cite_note-AlbertStrengthen14-21\" rel=\"external_link\">[21]<\/a><\/sup>, was used in the TB SLMTA roll-out. Recognition is given using a five-star grading system, with the following scores corresponding to the indicated number of stars: zero stars (0\u2013142 points; &lt; 55%), one star (143\u2013165 points; 55%\u201364%), two stars (166\u2013191 points; 65%\u201374%), three stars (192\u2013217 points; 75%\u201384%), four stars (218\u2013243 points; 85%\u201394%) and five stars (244\u2013258 points; \u2265 95%).\n<\/p><p>The TB SLMTA Harmonized Checklist v1.0 was recently revised in keeping with SLIPTA v2:2015 and the additional clauses of International Organization for Standardization 15189:2012. The questions added pertain to risk assessment, <a href=\"https:\/\/www.limswiki.org\/index.php\/Laboratory_information_system\" title=\"Laboratory information system\" target=\"_blank\" class=\"wiki-link\" data-key=\"37add65b4d1c678b382a7d4817a9cf64\">laboratory information systems<\/a>, contingency planning, and safety. The TB SLMTA Harmonized Checklist v1.0 is available in English and Spanish. 
The TB SLMTA Harmonized Checklist v2.1 is available in English and Russian.<sup id=\"rdp-ebb-cite_ref-ParamasivanExperience10_16-1\" class=\"reference\"><a href=\"#cite_note-ParamasivanExperience10-16\" rel=\"external_link\">[16]<\/a><\/sup>\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Implementation_of_TB_SLMTA\">Implementation of TB SLMTA<\/span><\/h3>\n<p>Implementation of the TB SLMTA program starts with the initial engagement with the Ministry of Health on the program scope and expected outputs, as well as commitments required from the country (Figure 1). During this planning phase, the country selects the participating tuberculosis laboratories, the model of implementation, the trainees to attend the TOT, and the TB SLMTA participants who will attend the in-country training. Countries select two or three participants per laboratory to attend the in-country TB SLMTA training. Typically, participants include the laboratory manager, quality officer, and one technician. After graduation from the TOT, the certified trainers implement the program in the country. Baseline and exit assessments are conducted with the TB SLMTA Harmonized Checklist v1.0 by trainers or SLIPTA-trained assessors with tuberculosis laboratory experience. In-country national or regional training is conducted over a period of 12\u201315 months. Between training sessions, participants work on improvement projects supervised by the TB SLMTA mentors. 
Post-TB SLMTA activities are conducted in the laboratories under supervision of the mentors before an external assessment determines the readiness for accreditation.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig1_Albert_AfricanJofLabMed2017_6-2.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"c947ab9da72b06a4bd04eef409a7636e\"><img alt=\"Fig1 Albert AfricanJofLabMed2017 6-2.jpg\" src=\"https:\/\/www.limswiki.org\/images\/2\/28\/Fig1_Albert_AfricanJofLabMed2017_6-2.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 1:<\/b> Diagrammatic representation of the TB SLMTA program from initiation to accreditation<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"Training-of-trainers_workshop\">Training-of-trainers workshop<\/span><\/h3>\n<p>The TB SLMTA TOTs are conducted by SLMTA Master Trainers and are based on teach-back methodology.<sup id=\"rdp-ebb-cite_ref-MarutaTraining14_22-0\" class=\"reference\"><a href=\"#cite_note-MarutaTraining14-22\" rel=\"external_link\">[22]<\/a><\/sup> This practice-based training approach requires trainees to play the roles of both trainer and participant as they teach the curriculum at the same time as they are learning the content. The TOTs provide trainees with an introduction to the TB SLMTA materials, practice in delivering the content and receiving feedback on their performance. The ratio of trainees to Master Trainers is a maximum of eight to one. To certify as trainers, trainees must demonstrate knowledge of TB SLMTA curriculum and proficiency in delivering training. 
Trainees who find teach-back challenging and do not show a good understanding of the materials graduate as one-on-one coaches. They can facilitate rollout in their laboratory but are not certified to train others.\n<\/p><p>Mentors are trainers who support the in-country training participants during the implementation phases between training sessions. During mentoring visits to the laboratory, they supervise the participants as they implement the improvement projects and provide resources (e.g., standard operating procedures) to implement what was taught in the training in the tuberculosis laboratory. The fundamentals of mentoring are modeled during the TOT. Trainees who are certified as trainers and who show an aptitude for mentoring are selected by the Master Trainers to perform mentoring in their countries. Mentoring in TB SLMTA builds on the relationship established between trainer and participant, and seeks to support program implementation in the laboratory. Master trainers support the certified trainers and mentors during their first national or regional training and where possible provide at least one interim visit to support mentoring. Trainers under supervision receive additional support from the Master Trainers during the workshop and, if assessed as proficient, can then graduate as trainers.\n<\/p><p>The TOTs are intensive and highly interactive, hence good language skills and a working knowledge of QMS concepts are required. Based on this observation and challenges experienced in conducting a TOT with participants with varying levels of English fluency, a mandatory online training was introduced prior to the TOT, based on the WHO's <i>Laboratory Quality Management System: Handbook<\/i><sup id=\"rdp-ebb-cite_ref-WHOLab11_23-0\" class=\"reference\"><a href=\"#cite_note-WHOLab11-23\" rel=\"external_link\">[23]<\/a><\/sup>, to ensure that trainees have a basic understanding of QMS principles. 
In addition, trainees whose first language is not English are required to successfully complete an online language competency training before registration for the TOT.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Models_for_implementation\">Models for implementation<\/span><\/h3>\n<p>Two models have been adopted for implementation of TB SLMTA:\n<\/p>\n<ol><li><i>Workshop approach<\/i>: Where several tuberculosis laboratories are available in-country, or in cases where more than one country conducts centralized training, the three-workshop approach can be used. Three five-day regional workshops are conducted by trainers, approximately three months apart.<\/li>\n<li> <i>Facility-based approach<\/i>: Where there is only one tuberculosis laboratory in the country being enrolled in TB SLMTA, the facility-based approach may be used. The facility-based approach follows the same TB SLMTA curriculum, with training sessions split into three blocks over 12 months.<\/li><\/ol>\n<p>Factors affecting choice of implementation model include funding, number of laboratories participating in TB SLMTA, and availability of staff. TB SLMTA is targeted for implementation in tuberculosis laboratories at the national or referral level. These laboratories conduct advanced tuberculosis testing and generally have separate facilities from general laboratories. Laboratories conducting tuberculosis testing at lower levels of the healthcare system are not targeted by this training. However, this does not preclude the use of TB SLMTA resources to guide them, especially those related to safety and quality assurance.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Improvement_projects_and_mentoring\">Improvement projects and mentoring<\/span><\/h3>\n<p>Improvement projects are broad-based activities that address weaknesses in the QMS. Topics for improvement projects are chosen from subjects covered in training. 
As with SLMTA (Table 1), each participant is required to complete two improvement projects between training sessions. The \"just do it\" project (e.g., maintaining personnel files) is implemented as a group by all the participants from the laboratory. The \"complex\" project, which requires extensive planning and before-and-after data collection, is chosen with assistance from the certified trainers. Ideally, laboratory management is included in the decision of the topic and scope (if laboratory managers are not participants) to ensure management engagement and allocation of time and resources to complete the projects. The projects are implemented by the participants but should involve the entire laboratory staff. Participants present their findings at national or regional workshops or on a day set aside by the laboratory (facility-based approach).\n<\/p><p>FIND found that the choice of improvement projects often did not reflect the priority gaps of the laboratory. In 2015, FIND adopted a more stringent criterion for improvement project selection. Under the guidance of certified trainers, each participant completes two improvement projects between training sessions; both are \"complex\" and require extensive planning and data collection. The first project is based on the subjects covered in the training. For example, Training 1 includes <i>Quality indicators<\/i> and <i>Facilities and safety<\/i>; Training 2 includes <i>Equipment<\/i>, <i>Purchasing and inventory<\/i>, and <i>Quality assurance<\/i>; and Training 3 includes <i>Documents and records<\/i>, <i>Client Management and Customer Service<\/i>, and <i>Specimen management<\/i> (Table 2). The second project addresses the weaknesses identified during the baseline assessment. 
These non-conformities are split between the participants, and a different section of the TB SLMTA Harmonized Checklist is covered between training sessions.\n<\/p><p>TB SLMTA uses a short-term mentoring model instead of the embedded model encouraged by SLMTA. 
Mentoring visits are conducted by the trainers over two or three days. Each facility receives two visits between each workshop. The outcomes of the mentoring visits and, in particular, the progress with improvement projects, are monitored by the mentors for each laboratory, and any necessary support is provided. Standardized data collection tools are used to record the findings of mentor visits.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Tab2_Albert_AfricanJofLabMed2017_6-2.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"305ef79c1782395c284087a0d09e0341\"><img alt=\"Tab2 Albert AfricanJofLabMed2017 6-2.jpg\" src=\"https:\/\/www.limswiki.org\/images\/1\/15\/Tab2_Albert_AfricanJofLabMed2017_6-2.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Table 2:<\/b> Examples and types of improvement projects implemented in the TB SLMTA program<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h2><span class=\"mw-headline\" id=\"Results_from_TB_SLMTA_implementation_workshops\">Results from TB SLMTA implementation workshops<\/span><\/h2>\n<p>Since 2013, four regional TOTs have been conducted in Lesotho, Vietnam, South Africa, and Moldova. Seventy trainees from 27 countries have been trained, and 59 are certified as trainers (including trainers under supervision), of which four participants are from WHO Supranational Reference Laboratories that provide tuberculosis laboratory technical support to countries. Twenty-six trainers are currently active in the TB SLMTA program. Currently there are three Master Trainers. 
One Master Trainer, based in the African region, graduated after conducting a round of TB SLMTA, and we expect two more graduates in the coming year (one in the African region and one in South East Asia) for a total of six Master Trainers.\n<\/p><p>The TB SLMTA program has been rolled out in 37 tuberculosis laboratories in 10 countries (Figure 2). National or regional TB SLMTA training using the workshop approach was conducted in 32 laboratories in five countries (Dominican Republic, Ethiopia, Lesotho, Tanzania, and Vietnam). The facility-based approach has been used in one regional tuberculosis laboratory in Cameroon. The instructional phase is complete in these laboratories but is ongoing in the four NTRLs in Eastern Europe (Armenia, Azerbaijan, Belarus, and Moldova).\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig2_Albert_AfricanJofLabMed2017_6-2.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"19054c66147ceb2550db9870d0154b4a\"><img alt=\"Fig2 Albert AfricanJofLabMed2017 6-2.jpg\" src=\"https:\/\/www.limswiki.org\/images\/b\/b0\/Fig2_Albert_AfricanJofLabMed2017_6-2.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 2:<\/b> Implementation of TB SLMTA in 37 tuberculosis laboratories in 10 countries since 2013<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>Baseline and exit assessment scores for 18 laboratories in four countries (Cameroon, Ethiopia, Lesotho, and Tanzania) were available for analysis and are summarized in Table 3. At baseline, six of the 18 laboratories had a zero-star rating, three had a one-star rating, seven had a two-star rating and two laboratories had a four-star rating. 
No laboratories had three- or five-star ratings at baseline assessment. At exit, two laboratories remained at zero stars, two were rated at one star, four laboratories were rated at two stars, seven were rated at three stars and three laboratories were rated at four stars. The impact of TB SLMTA, as well as the individual country experiences, will be addressed in separate publications.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Tab3_Albert_AfricanJofLabMed2017_6-2.jpg\" class=\"image wiki-link\" target=\"_blank\" data-key=\"2a70d9908d7799cedb0c15c94888f8b5\"><img alt=\"Tab3 Albert AfricanJofLabMed2017 6-2.jpg\" src=\"https:\/\/www.limswiki.org\/images\/5\/5f\/Tab3_Albert_AfricanJofLabMed2017_6-2.jpg\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Table 3:<\/b> \"Stars\" at baseline and exit for 18 tuberculosis laboratories in four countries completing the TB SLMTA program (2013\u20132016); results based on TB SLMTA Harmonized Checklist baseline and exit scores<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>FIND developed an online biosafety training program in 2014<sup id=\"rdp-ebb-cite_ref-FINDOnlineTraining_24-0\" class=\"reference\"><a href=\"#cite_note-FINDOnlineTraining-24\" rel=\"external_link\">[24]<\/a><\/sup>, and TB SLMTA participants in Tanzania and Lesotho were enrolled in this training to complement the basic biosafety module of the TB SLMTA program. This task-based online training was implemented in conjunction with biosafety improvement projects following Workshop 1.\n<\/p><p>Active participation over the extended duration of the in-country training is a challenge for trainers and participants alike. 
In our cohort, 21 participants (Lesotho, one; Dominican Republic, eight; Ethiopia, seven; Tanzania, three; Vietnam, two) were unable to complete the compulsory training and improvement projects due to personal or job-related reasons. Although in most cases additional participants from the same laboratory meant that the laboratory was not excluded from continuing the program, one regional tuberculosis laboratory in Tanzania was not able to complete the program as both participants were unable to finish the training.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Discussion\">Discussion<\/span><\/h2>\n<p>Tuberculosis laboratories are an essential element of tuberculosis prevention and care, providing testing for diagnosis, surveillance, and treatment monitoring that can be accessible at all levels of the healthcare system. The TB SLMTA program provides tuberculosis laboratories with customized support to accelerate the process of strengthening their QMS towards accreditation. There is an urgent need to expand the program, as only 21 NTRLs (43%) on the African continent have received SLMTA training, and only four NTRLs have reached accreditation. Although 44% of NTRLs report implementing a QMS, the extent of implementation is not known.<sup id=\"rdp-ebb-cite_ref-AlbertImplement17_25-0\" class=\"reference\"><a href=\"#cite_note-AlbertImplement17-25\" rel=\"external_link\">[25]<\/a><\/sup>\n<\/p><p>There were a number of challenges to implementing the TB SLMTA program in the initial cohort of laboratories. The lack of experienced assessors was a challenge in some countries. SLIPTA-trained assessors with experience in tuberculosis testing were used to supplement certified TB SLMTA trainers. However, limited hands-on time spent with the TB SLMTA Harmonized Checklist during the TB SLMTA TOT, and SLIPTA-trained assessors who are unfamiliar with implementing the tuberculosis laboratory-specific clauses, may lead to inflated scoring during these assessments. 
While laboratories enrolled in the TB SLMTA program may use the WHO Regional Office for Africa SLIPTA checklist, the additional components from GLI included in the TB SLMTA Harmonized Checklist v1.0 enable technical assessment alongside assessment of <a href=\"https:\/\/www.limswiki.org\/index.php\/International_Organization_for_Standardization\" title=\"International Organization for Standardization\" target=\"_blank\" class=\"wiki-link\" data-key=\"116defc5d89c8a55f5b7c1be0790b442\">International Organization for Standardization<\/a> (ISO) components.\n<\/p><p>In instances where management had not been fully engaged in the TB SLMTA implementation, participants struggled to complete the improvement projects. It is therefore critical to actively engage upper management, both at the facility level and at the national Ministry of Health, to ensure their commitment to the program. Institutionalization of the QMS into country programs will be needed to support tuberculosis laboratories in achieving accreditation. Training and quality improvement activities may be seen as extra workload, especially in settings where staff shortages and high workload are existing challenges. Furthermore, trainers and mentors, who are critical components of the program, are required to support the program in addition to their usual duties. This may put additional strain on the laboratory, as other staff are required to cover their workstations during their absence.\n<\/p><p>In addition to senior-level engagement of the Ministry of Health, QMS activities being conducted by various implementing partners and donors should be coordinated centrally to ensure synergy and to avoid duplication of effort, confusion, and wastage of resources. We found multiple partners conducting overlapping activities related to QMS without clear coordination to ensure cost-efficiency and maximum impact from available resources. 
Partners should seek active collaboration on QMS activities, harmonization of approaches, and contributions of various groups, under the leadership and coordination of the Ministry of Health.\n<\/p><p>The TOTs are highly interactive, and some trainees whose first language is not English find the training challenging. The introduction of online language-proficiency training and online introductory QMS training in 2014 helped ensure that trainees in the TOTs were successfully certified as trainers. However, this approach limits potential trainees. In 2016, FIND conducted a TOT in English, with real-time Russian translation (using a tuberculosis laboratory specialist as translator). All the trainees passed, suggesting that the model can be expanded to non-English speaking countries using translated materials (including the TB SLMTA Harmonized Checklist) and real-time translation. Careful consideration must be given to the choice of translator, with preference given to those who have an insight into laboratory testing or QMS. Further analysis of this approach is required. Master Trainers are certified after successful supervision of the roll-out of the TB SLMTA program in a country. To facilitate the expansion of the TB SLMTA program, there is a need for more Master Trainers, particularly those who can train in languages other than English.\n<\/p><p>As noted earlier, FIND recently adopted a more stringent criterion for improvement project selection. A focus on the weaknesses identified in baseline assessment, in particular quality indicator and quality control monitoring and safety in the tuberculosis laboratory, has the potential to improve the impact of the TB SLMTA program. As the cohort of tuberculosis laboratories that have used this strategy increases, the impact will be measured.\n<\/p><p>Mentoring of laboratories was found to be an important component to successful implementation of SLMTA. 
Embedded mentoring has proven to result in measurable improvement in the QMS in many countries, including Lesotho, Zimbabwe, Kenya, and Nigeria.<sup id=\"rdp-ebb-cite_ref-MarutaImpact12_26-0\" class=\"reference\"><a href=\"#cite_note-MarutaImpact12-26\" rel=\"external_link\">[26]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-NzombeMaximising14_27-0\" class=\"reference\"><a href=\"#cite_note-NzombeMaximising14-27\" rel=\"external_link\">[27]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-MakokhaUsing14_28-0\" class=\"reference\"><a href=\"#cite_note-MakokhaUsing14-28\" rel=\"external_link\">[28]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-MarutaSetting13_29-0\" class=\"reference\"><a href=\"#cite_note-MarutaSetting13-29\" rel=\"external_link\">[29]<\/a><\/sup> In TB SLMTA, certified trainers mentor participants during site visits and remotely between workshops. This short-term mentoring model is cost-effective, scalable, and sustainable, and it is well suited to the workshop approach of implementation used in our cohort. Ongoing structured mentoring of the tuberculosis laboratories that obtained four-star ratings at TB SLMTA exit assessment is being conducted in preparation for accreditation. The TB SLMTA program is currently focused on tuberculosis laboratories with the capacity to perform advanced diagnostics such as culture and drug susceptibility testing. Tuberculosis laboratories at lower levels of the healthcare system may consider integration into current SLMTA activities. In addition, if feasible, countries should consider sharing mentoring and assessments between programs. These cost-cutting approaches have the added benefit of integrating services, present opportunities for knowledge sharing, and will encourage sustainability and institutionalization of the QMS.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Limitations\">Limitations<\/span><\/h3>\n<p>This study is subject to a number of limitations. 
First, none of the TB SLMTA laboratories have reached accreditation yet, and we are thus reporting on intermediate measures of quality improvement leading to the ultimate target of accreditation. Second, quality improvement from three stars to five stars (which is considered equivalent to accreditation readiness) is challenging.<sup id=\"rdp-ebb-cite_ref-NdihokubwayoImplement16_30-0\" class=\"reference\"><a href=\"#cite_note-NdihokubwayoImplement16-30\" rel=\"external_link\">[30]<\/a><\/sup> Third, the role of mentors in this final phase is still to be determined. Finally, in this article we have not addressed the costs of TB SLMTA. A cost estimation exercise is being undertaken. We do not expect the costs to differ substantially from costs of the SLMTA program as reported by others.<sup id=\"rdp-ebb-cite_ref-ShumbaWeighing14_31-0\" class=\"reference\"><a href=\"#cite_note-ShumbaWeighing14-31\" rel=\"external_link\">[31]<\/a><\/sup>\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Conclusions\">Conclusions<\/span><\/h3>\n<p>TB SLMTA is a structured training and mentoring program that is customized to meet the needs of tuberculosis laboratories implementing a QMS in resource-limited settings within a reasonably short time frame, building a foundation for further quality improvement toward accreditation. 
Expansion of this program is an urgent priority to address the need for accreditation of tuberculosis laboratories on the African continent and beyond.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Acknowledgements\">Acknowledgements<\/span><\/h2>\n<p>The findings and conclusions in this publication are those of the authors and do not necessarily represent the official position of the CDC.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Competing_interests\">Competing interests<\/span><\/h3>\n<p>The authors declare that they have no financial or personal relationship(s) that may have inappropriately influenced them in writing this article.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Sources_of_support\">Sources of support<\/span><\/h3>\n<p>We are grateful to the United States President\u2019s Emergency Plan for AIDS Relief through the United States Centers for Disease Control and Prevention (3U2GPS002746), ExpandTB, UNITAID, UK Aid, Aus Aid, and the WHO for funding support.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Authors.E2.80.99_contributions\">Authors\u2019 contributions<\/span><\/h3>\n<p>H.A., A.T. and K.K. contributed to development and implementation of the program, data analysis, and preparation and critical review of the manuscript. D.E. contributed to the data analysis. All authors agreed with the content of the manuscript.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-WHOGlobal15-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WHOGlobal15_1-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">World Health Organization (2015). 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/who.int\/tb\/post2015_strategy\/en\/\" target=\"_blank\">\"WHO End TB Strategy: Global strategy and targets for tuberculosis prevention, care and control after 2015\"<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/who.int\/tb\/post2015_strategy\/en\/\" target=\"_blank\">http:\/\/who.int\/tb\/post2015_strategy\/en\/<\/a><\/span><span class=\"reference-accessdate\">. Retrieved 22 August 2016<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=WHO+End+TB+Strategy%3A+Global+strategy+and+targets+for+tuberculosis+prevention%2C+care+and+control+after+2015&rft.atitle=&rft.aulast=World+Health+Organization&rft.au=World+Health+Organization&rft.date=2015&rft_id=http%3A%2F%2Fwho.int%2Ftb%2Fpost2015_strategy%2Fen%2F&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WHOGlobalTub15-2\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WHOGlobalTub15_2-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">World Health Organization (2015). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/apps.who.int\/iris\/bitstream\/10665\/191102\/1\/9789241565059_eng.pdf\" target=\"_blank\">\"Global Tuberculosis Report 2015\"<\/a> (PDF). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9789241565059<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/apps.who.int\/iris\/bitstream\/10665\/191102\/1\/9789241565059_eng.pdf\" target=\"_blank\">http:\/\/apps.who.int\/iris\/bitstream\/10665\/191102\/1\/9789241565059_eng.pdf<\/a><\/span><span class=\"reference-accessdate\">. Retrieved 22 August 2016<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Global+Tuberculosis+Report+2015&rft.atitle=&rft.aulast=World+Health+Organization&rft.au=World+Health+Organization&rft.date=2015&rft.isbn=9789241565059&rft_id=http%3A%2F%2Fapps.who.int%2Firis%2Fbitstream%2F10665%2F191102%2F1%2F9789241565059_eng.pdf&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KuznetsovTwo14-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-KuznetsovTwo14_3-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Kuznetsov, V.N.; Grjibovski, A.M.; Mariandyshev, A.O. et al. (2014). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4147085\" target=\"_blank\">\"Two vicious circles contributing to a diagnostic delay for tuberculosis patients in Arkhangelsk\"<\/a>. <i>Emerging Health Threats Journal<\/i> <b>7<\/b>: 24909. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3402%2Fehtj.v7.24909\" target=\"_blank\">10.3402\/ehtj.v7.24909<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4147085\/\" target=\"_blank\">PMC4147085<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25163673\" target=\"_blank\">25163673<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4147085\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4147085<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Two+vicious+circles+contributing+to+a+diagnostic+delay+for+tuberculosis+patients+in+Arkhangelsk&rft.jtitle=Emerging+Health+Threats+Journal&rft.aulast=Kuznetsov%2C+V.N.%3B+Grjibovski%2C+A.M.%3B+Mariandyshev%2C+A.O.+et+al.&rft.au=Kuznetsov%2C+V.N.%3B+Grjibovski%2C+A.M.%3B+Mariandyshev%2C+A.O.+et+al.&rft.date=2014&rft.volume=7&rft.pages=24909&rft_id=info:doi\/10.3402%2Fehtj.v7.24909&rft_id=info:pmc\/PMC4147085&rft_id=info:pmid\/25163673&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4147085&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AlemnjiStrengthen14-4\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AlemnjiStrengthen14_4-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation 
Journal\">Alemnji, G.A.; Zeh, C.; Yao, K.; Fonjungo, P.N. (2014). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4826025\" target=\"_blank\">\"Strengthening national health laboratories in sub-Saharan Africa: a decade of remarkable progress\"<\/a>. <i>Tropical Medicine & International Health<\/i> <b>19<\/b> (4): 450-8. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1111%2Ftmi.12269\" target=\"_blank\">10.1111\/tmi.12269<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4826025\/\" target=\"_blank\">PMC4826025<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/24506521\" target=\"_blank\">24506521<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4826025\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4826025<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Strengthening+national+health+laboratories+in+sub-Saharan+Africa%3A+a+decade+of+remarkable+progress&rft.jtitle=Tropical+Medicine+%26+International+Health&rft.aulast=Alemnji%2C+G.A.%3B+Zeh%2C+C.%3B+Yao%2C+K.%3B+Fonjungo%2C+P.N.&rft.au=Alemnji%2C+G.A.%3B+Zeh%2C+C.%3B+Yao%2C+K.%3B+Fonjungo%2C+P.N.&rft.date=2014&rft.volume=19&rft.issue=4&rft.pages=450-8&rft_id=info:doi\/10.1111%2Ftmi.12269&rft_id=info:pmc\/PMC4826025&rft_id=info:pmid\/24506521&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4826025&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Gershy-DametTheWorld10-5\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Gershy-DametTheWorld10_5-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Gershy-Damet, G.M.; Rotz, P.; Cross, D. et al. (2010). \"The World Health Organization African region laboratory accreditation process: Improving the quality of laboratory systems in the African region\". <i>American Journal of Clinical Pathology<\/i> <b>134<\/b> (3): 393-400. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1309%2FAJCPTUUC2V1WJQBM\" target=\"_blank\">10.1309\/AJCPTUUC2V1WJQBM<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/20716795\" target=\"_blank\">20716795<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+World+Health+Organization+African+region+laboratory+accreditation+process%3A+Improving+the+quality+of+laboratory+systems+in+the+African+region&rft.jtitle=American+Journal+of+Clinical+Pathology&rft.aulast=Gershy-Damet%2C+G.M.%3B+Rotz%2C+P.%3B+Cross%2C+D.+et+al.&rft.au=Gershy-Damet%2C+G.M.%3B+Rotz%2C+P.%3B+Cross%2C+D.+et+al.&rft.date=2010&rft.volume=134&rft.issue=3&rft.pages=393-400&rft_id=info:doi\/10.1309%2FAJCPTUUC2V1WJQBM&rft_id=info:pmid\/20716795&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PeterImpact10-6\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PeterImpact10_6-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Peter, T.F.; Rotz, P.D.; Blair, D.H. et al. (2010). \"Impact of laboratory accreditation on patient care and the health system\". <i>American Journal of Clinical Pathology<\/i> <b>134<\/b> (4): 550-5. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1309%2FAJCPH1SKQ1HNWGHF\" target=\"_blank\">10.1309\/AJCPH1SKQ1HNWGHF<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/20855635\" target=\"_blank\">20855635<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Impact+of+laboratory+accreditation+on+patient+care+and+the+health+system&rft.jtitle=American+Journal+of+Clinical+Pathology&rft.aulast=Peter%2C+T.F.%3B+Rotz%2C+P.D.%3B+Blair%2C+D.H.+et+al.&rft.au=Peter%2C+T.F.%3B+Rotz%2C+P.D.%3B+Blair%2C+D.H.+et+al.&rft.date=2010&rft.volume=134&rft.issue=4&rft.pages=550-5&rft_id=info:doi\/10.1309%2FAJCPH1SKQ1HNWGHF&rft_id=info:pmid\/20855635&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-YaoImproving10-7\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-YaoImproving10_7-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Yao, K.; McKinney, B.; Murphy, A. et al. (2010). \"Improving quality management systems of laboratories in developing countries: An innovative training approach to accelerate laboratory accreditation\". <i>American Journal of Clinical Pathology<\/i> <b>134<\/b> (3): 401-9. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1309%2FAJCPNBBL53FWUIQJ\" target=\"_blank\">10.1309\/AJCPNBBL53FWUIQJ<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/20716796\" target=\"_blank\">20716796<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Improving+quality+management+systems+of+laboratories+in+developing+countries%3A+An+innovative+training+approach+to+accelerate+laboratory+accreditation&rft.jtitle=American+Journal+of+Clinical+Pathology&rft.aulast=Yao%2C+K.%3B+McKinney%2C+B.%3B+Murphy%2C+A.+et+al.&rft.au=Yao%2C+K.%3B+McKinney%2C+B.%3B+Murphy%2C+A.+et+al.&rft.date=2010&rft.volume=134&rft.issue=3&rft.pages=401%E2%80%939&rft_id=info:doi\/10.1309%2FAJCPNBBL53FWUIQJ&rft_id=info:pmid\/20716796&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-YaoEvidence14-8\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-YaoEvidence14_8-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Yao, K.; Luman, E.T.; SLMTA Collaborating Authors (2014). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4706175\" target=\"_blank\">\"Evidence from 617 laboratories in 47 countries for SLMTA-driven improvement in quality management systems\"<\/a>. <i>African Journal of Laboratory Medicine<\/i> <b>3<\/b> (2): 262. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4102%2Fajlm.v3i2.262\" target=\"_blank\">10.4102\/ajlm.v3i2.262<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4706175\/\" target=\"_blank\">PMC4706175<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26753132\" target=\"_blank\">26753132<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4706175\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4706175<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Evidence+from+617+laboratories+in+47+countries+for+SLMTA-driven+improvement+in+quality+management+systems&rft.jtitle=African+Journal+of+Laboratory+Medicine&rft.aulast=Yao%2C+K.%3B+Luman%2C+E.T.%3B+SLMTA+Collaborating+Authors&rft.au=Yao%2C+K.%3B+Luman%2C+E.T.%3B+SLMTA+Collaborating+Authors&rft.date=2014&rft.volume=3&rft.issue=3&rft.pages=262&rft_id=info:doi\/10.4102%2Fajlm.v3i2.262&rft_id=info:pmc\/PMC4706175&rft_id=info:pmid\/26753132&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4706175&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SANASDirectory15-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SANASDirectory15_9-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/home.sanas.co.za\/?page_id=38\" target=\"_blank\">\"Directory of Accredited Facilities\"<\/a>. South African National Accreditation System. 2015<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/home.sanas.co.za\/?page_id=38\" target=\"_blank\">http:\/\/home.sanas.co.za\/?page_id=38<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 22 August 2016<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Directory+of+Accredited+Facilities&rft.atitle=&rft.date=2015&rft.pub=South+African+National+Accreditation+System&rft_id=http%3A%2F%2Fhome.sanas.co.za%2F%3Fpage_id%3D38&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-IPACDirectorio-10\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-IPACDirectorio_10-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ipac.pt\/pesquisa\/acredita.asp\" target=\"_blank\">\"Direct\u00f3rio de Entidades Acreditadas\"<\/a>. Instituto Portugu\u00eas de Acredita\u00e7\u00e3o<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.ipac.pt\/pesquisa\/acredita.asp\" target=\"_blank\">http:\/\/www.ipac.pt\/pesquisa\/acredita.asp<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 19 January 2016<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Direct%C3%B3rio+de+Entidades+Acreditradas&rft.atitle=&rft.pub=Instituto+Portugu%C3%AAs+de+Acredita%C3%A7%C3%A3o&rft_id=http%3A%2F%2Fwww.ipac.pt%2Fpesquisa%2Facredita.asp&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-RidderhofRoles07-11\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-RidderhofRoles07_11-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Ridderhof, J.C.; van Deun, A.; Kam, K.M. et al. (2007). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2636656\" target=\"_blank\">\"Roles of laboratories and laboratory systems in effective tuberculosis programmes\"<\/a>. <i>Bulletin of the World Health Organization<\/i> <b>85<\/b> (5): 354-9. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC2636656\/\" target=\"_blank\">PMC2636656<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/17639219\" target=\"_blank\">17639219<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2636656\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2636656<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Roles+of+laboratories+and+laboratory+systems+in+effective+tuberculosis+programmes&rft.jtitle=Bulletin+of+the+World+Health+Organization&rft.aulast=Ridderhof%2C+J.C.%3B+van+Deun%2C+A.%3B+Kam%2C+K.M.+et+al.&rft.au=Ridderhof%2C+J.C.%3B+van+Deun%2C+A.%3B+Kam%2C+K.M.+et+al.&rft.date=2007&rft.volume=85&rft.issue=5&rft.pages=354-9&rft_id=info:pmc\/PMC2636656&rft_id=info:pmid\/17639219&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC2636656&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-RaizadaAMulti14-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-RaizadaAMulti14_12-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Raizada, N.; Sachdeva, K.S.; Chauhan, D.S. et al. (2014). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3929364\" target=\"_blank\">\"A multi-site validation in India of the line probe assay for the rapid diagnosis of multi-drug resistant tuberculosis directly from sputum specimens\"<\/a>. <i>PLoS One<\/i> <b>9<\/b> (2): e88626. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pone.0088626\" target=\"_blank\">10.1371\/journal.pone.0088626<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3929364\/\" target=\"_blank\">PMC3929364<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/24586360\" target=\"_blank\">24586360<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3929364\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3929364<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+multi-site+validation+in+India+of+the+line+probe+assay+for+the+rapid+diagnosis+of+multi-drug+resistant+tuberculosis+directly+from+sputum+specimens&rft.jtitle=PLoS+One&rft.aulast=Raizada%2C+N.%3B+Sachdeva%2C+K.S.%3B+Chauhan%2C+D.S.+et+al.&rft.au=Raizada%2C+N.%3B+Sachdeva%2C+K.S.%3B+Chauhan%2C+D.S.+et+al.&rft.date=2014&rft.volume=9&rft.issue=2&rft.pages=e88626&rft_id=info:doi\/10.1371%2Fjournal.pone.0088626&rft_id=info:pmc\/PMC3929364&rft_id=info:pmid\/24586360&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3929364&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-RaizadaFeasib14-13\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-RaizadaFeasib14_13-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Raizada, N.; Sachdeva, K.S.; Sreenivas, A. et al. (2014). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3935858\" target=\"_blank\">\"Feasibility of decentralised deployment of Xpert MTB\/RIF test at lower level of health system in India\"<\/a>. <i>PLoS One<\/i> <b>9<\/b> (2): e89301. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pone.0089301\" target=\"_blank\">10.1371\/journal.pone.0089301<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3935858\/\" target=\"_blank\">PMC3935858<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/24586675\" target=\"_blank\">24586675<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3935858\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3935858<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Feasibility+of+decentralised+deployment+of+Xpert+MTB%2FRIF+test+at+lower+level+of+health+system+in+India&rft.jtitle=PLoS+One&rft.aulast=Raizada%2C+N.%3B+Sachdeva%2C+K.S.%3B+Sreenivas%2C+A.+et+al.&rft.au=Raizada%2C+N.%3B+Sachdeva%2C+K.S.%3B+Sreenivas%2C+A.+et+al.&rft.date=2014&rft.volume=9&rft.issue=2&rft.pages=e89301&rft_id=info:doi\/10.1371%2Fjournal.pone.0089301&rft_id=info:pmc\/PMC3935858&rft_id=info:pmid\/24586675&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3935858&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AlbertRapid10-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AlbertRapid10_14-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Albert, H.; Bwanga, F.; 
Mukkada, S. et al. (2010). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2841659\" target=\"_blank\">\"Rapid screening of MDR-TB using molecular Line Probe Assay is feasible in Uganda\"<\/a>. <i>BMC Infectious Diseases<\/i> <b>10<\/b>: 41. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2F1471-2334-10-41\" target=\"_blank\">10.1186\/1471-2334-10-41<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC2841659\/\" target=\"_blank\">PMC2841659<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/20187922\" target=\"_blank\">20187922<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2841659\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2841659<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Rapid+screening+of+MDR-TB+using+molecular+Line+Probe+Assay+is+feasible+in+Uganda&rft.jtitle=BMC+Infectious+Diseases&rft.aulast=Albert%2C+H.%3B+Bwanga%2C+F.%3B+Mukkada%2C+S.+et+al.&rft.au=Albert%2C+H.%3B+Bwanga%2C+F.%3B+Mukkada%2C+S.+et+al.&rft.date=2010&rft.volume=10&rft.pages=41&rft_id=info:doi\/10.1186%2F1471-2334-10-41&rft_id=info:pmc\/PMC2841659&rft_id=info:pmid\/20187922&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC2841659&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AlbertPerform10-15\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AlbertPerform10_15-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Albert, H.; Manabe, Y.; Lukyamuzi, G. et al. (2010). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3011008\" target=\"_blank\">\"Performance of three LED-based fluorescence microscopy systems for detection of tuberculosis in Uganda\"<\/a>. <i>PLoS One<\/i> <b>5<\/b> (12): e15206. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pone.0015206\" target=\"_blank\">10.1371\/journal.pone.0015206<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3011008\/\" target=\"_blank\">PMC3011008<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/21203398\" target=\"_blank\">21203398<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3011008\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3011008<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Performance+of+three+LED-based+fluorescence+microscopy+systems+for+detection+of+tuberculosis+in+Uganda&rft.jtitle=PLoS+One&rft.aulast=Albert%2C+H.%3B+Manabe%2C+Y.%3B+Lukyamuzi%2C+G.+et+al.&rft.au=Albert%2C+H.%3B+Manabe%2C+Y.%3B+Lukyamuzi%2C+G.+et+al.&rft.date=2010&rft.volume=5&rft.issue=12&rft.pages=e15206&rft_id=info:doi\/10.1371%2Fjournal.pone.0015206&rft_id=info:pmc\/PMC3011008&rft_id=info:pmid\/21203398&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3011008&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span 
style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ParamasivanExperience10-16\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-ParamasivanExperience10_16-0\" rel=\"external_link\">16.0<\/a><\/sup> <sup><a href=\"#cite_ref-ParamasivanExperience10_16-1\" rel=\"external_link\">16.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Paramasivan, C.N.; Lee, E.; Kao, K. et al. (2010). \"Experience establishing tuberculosis laboratory capacity in a developing country setting\". <i>International Journal of Tuberculosis and Lung Disease<\/i> <b>13<\/b> (1): 59-64. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/20003696\" target=\"_blank\">20003696<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Experience+establishing+tuberculosis+laboratory+capacity+in+a+developing+country+setting&rft.jtitle=International+Journal+of+Tuberculosis+and+Lung+Disease&rft.aulast=Paramasivan%2C+C.N.%3B+Lee%2C+E.%3B+Kao%2C+K.+et+al.&rft.au=Paramasivan%2C+C.N.%3B+Lee%2C+E.%3B+Kao%2C+K.+et+al.&rft.date=2010&rft.volume=13&rft.issue=1&rft.pages=59-64&rft_id=info:pmid\/20003696&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GLIStepwise-17\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GLIStepwise_17-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.gliquality.org\/\" target=\"_blank\">\"GLI Stepwise Process towards TB 
Laboratory Accreditation\"<\/a>. Global Laboratory Initiative<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.gliquality.org\/\" target=\"_blank\">http:\/\/www.gliquality.org\/<\/a><\/span><span class=\"reference-accessdate\">. Retrieved 22 August 2016<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=GLI+Stepwise+Process+towards+TB+Laboratory+Accreditation&rft.atitle=&rft.pub=Global+Laboratory+Initiative&rft_id=http%3A%2F%2Fwww.gliquality.org%2F&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-FIND_TBLab16-18\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-FIND_TBLab16_18-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.finddx.org\/wp-content\/uploads\/2016\/07\/NEW-TB-Harmonized-Checklist-v2.1-2-2016.pdf\" target=\"_blank\">\"TB Laboratory Quality Management Systems Towards Accreditation Harmonized Checklist\"<\/a> (PDF). FIND. February 2016<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.finddx.org\/wp-content\/uploads\/2016\/07\/NEW-TB-Harmonized-Checklist-v2.1-2-2016.pdf\" target=\"_blank\">https:\/\/www.finddx.org\/wp-content\/uploads\/2016\/07\/NEW-TB-Harmonized-Checklist-v2.1-2-2016.pdf<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 19 January 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=TB+Laboratory+Quality+Management+Systems+Towards+Accreditation+Harmonized+Checklist&rft.atitle=&rft.date=February+2016&rft.pub=FIND&rft_id=https%3A%2F%2Fwww.finddx.org%2Fwp-content%2Fuploads%2F2016%2F07%2FNEW-TB-Harmonized-Checklist-v2.1-2-2016.pdf&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WHOAFRO-SLIPTA-19\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WHOAFRO-SLIPTA_19-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.afro.who.int\/en\/downloads\/cat_view\/1501-english\/787-blood-safety.html\" target=\"_blank\">\"WHO AFRO SLIPTA Checklist\"<\/a>. African Society for Laboratory Medicine. 2007<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.afro.who.int\/en\/downloads\/cat_view\/1501-english\/787-blood-safety.html\" target=\"_blank\">http:\/\/www.afro.who.int\/en\/downloads\/cat_view\/1501-english\/787-blood-safety.html<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 19 January 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=WHO+AFRO+SLIPTA+Checklist&rft.atitle=&rft.date=2007&rft.pub=African+Society+for+Laboratory+Medicine&rft_id=http%3A%2F%2Fwww.afro.who.int%2Fen%2Fdownloads%2Fcat_view%2F1501-english%2F787-blood-safety.html&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MarutaHarmonizing12-20\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MarutaHarmonizing12_20-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Maruta, T.; Albert, H.; Hove, P. et al. (2012). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/citeweb.info\/20121448643\" target=\"_blank\">\"Harmonizing quality improvement of TB laboratories with generic accreditation initiatives\"<\/a>. <i>ASLM Conference Proceedings<\/i><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/citeweb.info\/20121448643\" target=\"_blank\">http:\/\/citeweb.info\/20121448643<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Harmonizing+quality+improvement+of+TB+laboratories+with+generic+accreditation+initiatives&rft.jtitle=ASLM+Conference+Proceedings&rft.aulast=Maruta%2C+T.%3B+Albert%2C+H.%3B+Hove%2C+P.+et+al.&rft.au=Maruta%2C+T.%3B+Albert%2C+H.%3B+Hove%2C+P.+et+al.&rft.date=2012&rft_id=http%3A%2F%2Fciteweb.info%2F20121448643&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AlbertStrengthen14-21\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AlbertStrengthen14_21-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Albert, H. (2014). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/html5.slideonline.eu\/event\/14UNION\" target=\"_blank\">\"Strengthening laboratory management toward accreditation programme: Transforming the lab landscape in developing countries and customisation for labs\"<\/a>. <i>Proceedings of the 45th Union World Conference on Lung Health<\/i><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/html5.slideonline.eu\/event\/14UNION\" target=\"_blank\">http:\/\/html5.slideonline.eu\/event\/14UNION<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Strengthening+laboratory+management+toward+accreditation+programme%3A+Transforming+the+lab+landscape+in+developing+countries+and+customisation+for+labs&rft.jtitle=Proceedings+of+the+45th+Union+World+Conference+on+Lung+Health&rft.aulast=Albert%2C+H.&rft.au=Albert%2C+H.&rft.date=2014&rft_id=http%3A%2F%2Fhtml5.slideonline.eu%2Fevent%2F14UNION&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MarutaTraining14-22\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MarutaTraining14_22-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Maruta, T.; Yao, K.; Ndlovu, N. et al. (2014). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4703333\" target=\"_blank\">\"Training-of-trainers: A strategy to build country capacity for SLMTA expansion and sustainability\"<\/a>. <i>African Journal of Laboratory Medicine<\/i> <b>3<\/b> (2): 196. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4102%2Fajlm.v3i2.196\" target=\"_blank\">10.4102\/ajlm.v3i2.196<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4703333\/\" target=\"_blank\">PMC4703333<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26753131\" target=\"_blank\">26753131<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4703333\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4703333<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Training-of-trainers%3A+A+strategy+to+build+country+capacity+for+SLMTA+expansion+and+sustainability&rft.jtitle=African+Journal+of+Laboratory+Medicine&rft.aulast=Maruta%2C+T.%3B+Yao%2C+K.%3B+Ndlovu%2C+N.+et+al.&rft.au=Maruta%2C+T.%3B+Yao%2C+K.%3B+Ndlovu%2C+N.+et+al.&rft.date=2014&rft.volume=3&rft.issue=2&rft.pages=196&rft_id=info:doi\/10.4102%2Fajlm.v3i2.196&rft_id=info:pmc\/PMC4703333&rft_id=info:pmid\/26753131&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4703333&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WHOLab11-23\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WHOLab11_23-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">World Health Organization (2011). 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.who.int\/ihr\/publications\/lqms\/en\/\" target=\"_blank\">\"Laboratory Quality Management System: Handbook\"<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9789241548274<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.who.int\/ihr\/publications\/lqms\/en\/\" target=\"_blank\">http:\/\/www.who.int\/ihr\/publications\/lqms\/en\/<\/a><\/span><span class=\"reference-accessdate\">. Retrieved 22 August 2016<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Laboratory+Quality+Management+System%3A+Handbook&rft.atitle=&rft.aulast=World+Health+Organization&rft.au=World+Health+Organization&rft.date=2011&rft.isbn=9789241548274&rft_id=http%3A%2F%2Fwww.who.int%2Fihr%2Fpublications%2Flqms%2Fen%2F&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-FINDOnlineTraining-24\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-FINDOnlineTraining_24-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Foundation for Innovative New Diagnostics. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/finddiagnostics-training.org\/moodle\/\" target=\"_blank\">\"FIND Online Training\"<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/finddiagnostics-training.org\/moodle\/\" target=\"_blank\">http:\/\/finddiagnostics-training.org\/moodle\/<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 22 August 2016<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=FIND+Online+Training&rft.atitle=&rft.aulast=Foundation+for+Innovative+New+Diagnostics&rft.au=Foundation+for+Innovative+New+Diagnostics&rft_id=http%3A%2F%2Ffinddiagnostics-training.org%2Fmoodle%2F&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AlbertImplement17-25\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AlbertImplement17_25-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Albert, H.; de Dieu Iragena, J.; Kao, K. et al. (2017). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5523922\" target=\"_blank\">\"Implementation of quality management systems and progress towards accreditation of National Tuberculosis Reference Laboratories in Africa\"<\/a>. <i>African Journal of Laboratory Medicine<\/i> <b>6<\/b> (2): 490. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4102%2Fajlm.v6i2.490\" target=\"_blank\">10.4102\/ajlm.v6i2.490<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5523922\/\" target=\"_blank\">PMC5523922<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/28879161\" target=\"_blank\">28879161<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5523922\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5523922<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Implementation+of+quality+management+systems+and+progress+towards+accreditation+of+National+Tuberculosis+Reference+Laboratories+in+Africa&rft.jtitle=African+Journal+of+Laboratory+Medicine&rft.aulast=Albert%2C+H.%3B+de+Dieu+Iragena%2C+J.%3B+Kao%2C+K.+et+al.&rft.au=Albert%2C+H.%3B+de+Dieu+Iragena%2C+J.%3B+Kao%2C+K.+et+al.&rft.date=2017&rft.volume=6&rft.issue=2&rft.pages=490&rft_id=info:doi\/10.4102%2Fajlm.v6i2.490&rft_id=info:pmc\/PMC5523922&rft_id=info:pmid\/28879161&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5523922&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MarutaImpact12-26\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MarutaImpact12_26-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Maruta, T.; Motenbang, D.; Mathabo, L. et al. (2012). 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5644515\" target=\"_blank\">\"Impact of mentorship on WHO-AFRO Strengthening Laboratory Quality Improvement Process Towards Accreditation (SLIPTA)\"<\/a>. <i>African Journal of Laboratory Medicine<\/i> <b>1<\/b> (1): 6. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4102%2Fajlm.v1i1.6\" target=\"_blank\">10.4102\/ajlm.v1i1.6<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5644515\/\" target=\"_blank\">PMC5644515<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/29062726\" target=\"_blank\">29062726<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5644515\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5644515<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Impact+of+mentorship+on+WHO-AFRO+Strengthening+Laboratory+Quality+Improvement+Process+Towards+Accreditation+%28SLIPTA%29&rft.jtitle=African+Journal+of+Laboratory+Medicine&rft.aulast=Maruta%2C+T.%3B+Motenbang%2C+D.%3B+Mathabo%2C+L.+et+al.&rft.au=Maruta%2C+T.%3B+Motenbang%2C+D.%3B+Mathabo%2C+L.+et+al.&rft.date=2012&rft.volume=1&rft.issue=1&rft.pages=6&rft_id=info:doi\/10.4102%2Fajlm.v1i1.6&rft_id=info:pmc\/PMC5644515&rft_id=info:pmid\/29062726&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5644515&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NzombeMaximising14-27\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-NzombeMaximising14_27-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Nzombe, P.; Luman, E.T.; Shumba, E. et al. (2014). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5637805\" target=\"_blank\">\"Maximising mentorship: Variations in laboratory mentorship models implemented in Zimbabwe\"<\/a>. <i>African Journal of Laboratory Medicine<\/i> <b>3<\/b> (2): 241. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4102%2Fajlm.v3i2.241\" target=\"_blank\">10.4102\/ajlm.v3i2.241<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5637805\/\" target=\"_blank\">PMC5637805<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/29043196\" target=\"_blank\">29043196<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5637805\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5637805<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Maximising+mentorship%3A+Variations+in+laboratory+mentorship+models+implemented+in+Zimbabwe&rft.jtitle=African+Journal+of+Laboratory+Medicine&rft.aulast=Nzombe%2C+P.%3B+Luman%2C+E.T.%3B+Shumba%2C+E.+et+al.&rft.au=Nzombe%2C+P.%3B+Luman%2C+E.T.%3B+Shumba%2C+E.+et+al.&rft.date=2014&rft.volume=3&rft.issue=2&rft.pages=241&rft_id=info:doi\/10.4102%2Fajlm.v3i2.241&rft_id=info:pmc\/PMC5637805&rft_id=info:pmid\/29043196&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5637805&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: 
none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MakokhaUsing14-28\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MakokhaUsing14_28-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Makokha, E.P.; Mwalili, S.; Basiye, F.L. et al. (2014). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5637804\" target=\"_blank\">\"Using standard and institutional mentorship models to implement SLMTA in Kenya\"<\/a>. <i>African Journal of Laboratory Medicine<\/i> <b>3<\/b> (2): 220. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4102%2Fajlm.v3i2.220\" target=\"_blank\">10.4102\/ajlm.v3i2.220<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5637804\/\" target=\"_blank\">PMC5637804<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/29043191\" target=\"_blank\">29043191<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5637804\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5637804<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Using+standard+and+institutional+mentorship+models+to+implement+SLMTA+in+Kenya&rft.jtitle=African+Journal+of+Laboratory+Medicine&rft.aulast=Makokha%2C+E.P.%3B+Mwalili%2C+S.%3B+Basiye%2C+F.L.+et+al.&rft.au=Makokha%2C+E.P.%3B+Mwalili%2C+S.%3B+Basiye%2C+F.L.+et+al.&rft.date=2014&rft.volume=3&rft.issue=2&rft.pages=220&rft_id=info:doi\/10.4102%2Fajlm.v3i2.220&rft_id=info:pmc\/PMC5637804&rft_id=info:pmid\/29043191&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5637804&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MarutaSetting13-29\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MarutaSetting13_29-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Maruta, T.; Rotz, P.; Peter, T. et al. (2013). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5637775\" target=\"_blank\">\"Setting up a structured laboratory mentoring programme\"<\/a>. <i>African Journal of Laboratory Medicine<\/i> <b>2<\/b> (1): 77. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4102%2Fajlm.v2i1.77\" target=\"_blank\">10.4102\/ajlm.v2i1.77<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5637775\/\" target=\"_blank\">PMC5637775<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/29043168\" target=\"_blank\">29043168<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5637775\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5637775<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Setting+up+a+structured+laboratory+mentoring+programme&rft.jtitle=African+Journal+of+Laboratory+Medicine&rft.aulast=Maruta%2C+T.%3B+Rotz%2C+P.%3B+Peter%2C+T.+et+al.&rft.au=Maruta%2C+T.%3B+Rotz%2C+P.%3B+Peter%2C+T.+et+al.&rft.date=2013&rft.volume=2&rft.issue=1&rft.pages=77&rft_id=info:doi\/10.4102%2Fajlm.v2i1.77&rft_id=info:pmc\/PMC5637775&rft_id=info:pmid\/29043168&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5637775&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li 
id=\"cite_note-NdihokubwayoImplement16-30\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-NdihokubwayoImplement16_30-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Ndihokubwayo, J.B.; Maruta, T.; Ndlovu, N. et al. (2016). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5436392\" target=\"_blank\">\"Implementation of the World Health Organization Regional Office for Africa Stepwise Laboratory Quality Improvement Process Towards Accreditation\"<\/a>. <i>African Journal of Laboratory Medicine<\/i> <b>5<\/b> (1): 280. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4102%2Fajlm.v5i1.280\" target=\"_blank\">10.4102\/ajlm.v5i1.280<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5436392\/\" target=\"_blank\">PMC5436392<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/28879103\" target=\"_blank\">28879103<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5436392\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5436392<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Implementation+of+the+World+Health+Organization+Regional+Office+for+Africa+Stepwise+Laboratory+Quality+Improvement+Process+Towards+Accreditation&rft.jtitle=African+Journal+of+Laboratory+Medicine&rft.aulast=Ndihokubwayo%2C+J.B.%3B+Maruta%2C+T.%3B+Ndlovu%2C+N.+et+al.&rft.au=Ndihokubwayo%2C+J.B.%3B+Maruta%2C+T.%3B+Ndlovu%2C+N.+et+al.&rft.date=2016&rft.volume=5&rft.issue=1&rft.pages=280&rft_id=info:doi\/10.4102%2Fajlm.v5i1.280&rft_id=info:pmc\/PMC5436392&rft_id=info:pmid\/28879103&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5436392&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ShumbaWeighing14-31\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ShumbaWeighing14_31-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Shumba, E.; Nzombe, P.; Mbinda, A. et al. (2014). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5637799\" target=\"_blank\">\"Weighing the costs: Implementing the SLMTA programme in Zimbabwe using internal versus external facilitators\"<\/a>. <i>African Journal of Laboratory Medicine<\/i> <b>3<\/b> (2): 248. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4102%2Fajlm.v3i2.248\" target=\"_blank\">10.4102\/ajlm.v3i2.248<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5637799\/\" target=\"_blank\">PMC5637799<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/29043197\" target=\"_blank\">29043197<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5637799\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5637799<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Weighing+the+costs%3A+Implementing+the+SLMTA+programme+in+Zimbabwe+using+internal+versus+external+facilitators&rft.jtitle=African+Journal+of+Laboratory+Medicine&rft.aulast=Shumba%2C+E.%3B+Nzombe%2C+P.%3B+Mbinda%2C+A.+et+al.&rft.au=Shumba%2C+E.%3B+Nzombe%2C+P.%3B+Mbinda%2C+A.+et+al.&rft.date=2014&rft.volume=3&rft.issue=2&rft.pages=248&rft_id=info:doi\/10.4102%2Fajlm.v3i2.248&rft_id=info:pmc\/PMC5637799&rft_id=info:pmid\/29043197&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5637799&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\"><span 
style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to grammar, spelling, and presentation, including the addition of PMCID and DOI when they were missing from the original reference. Reference 19 is a dead URL, and an archived version could not be located. Reference 21 can't be found at the author-supplied URL.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" 
href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation\">https:\/\/www.limswiki.org\/index.php\/Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","15471c0a609cecac0db384f57371da08_images":["https:\/\/www.limswiki.org\/images\/8\/8a\/Tab1_Albert_AfricanJofLabMed2017_6-2.jpg","https:\/\/www.limswiki.org\/images\/2\/28\/Fig1_Albert_AfricanJofLabMed2017_6-2.jpg","https:\/\/www.limswiki.org\/images\/1\/15\/Tab2_Albert_AfricanJofLabMed2017_6-2.jpg","https:\/\/www.limswiki.org\/images\/b\/b0\/Fig2_Albert_AfricanJofLabMed2017_6-2.jpg","https:\/\/www.limswiki.org\/images\/5\/5f\/Tab3_Albert_AfricanJofLabMed2017_6-2.jpg"],"15471c0a609cecac0db384f57371da08_timestamp":1544813853,"478e0fc6bdbd74f64773b750f2c9edcc_type":"article","478e0fc6bdbd74f64773b750f2c9edcc_title":"SistematX, an online web-based cheminformatics tool for data management of secondary metabolites (Scotti et al. 
2018)","478e0fc6bdbd74f64773b750f2c9edcc_url":"https:\/\/www.limswiki.org\/index.php\/Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites","478e0fc6bdbd74f64773b750f2c9edcc_plaintext":"\n\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\n\t\t\t\tJournal:SistematX, an online web-based cheminformatics tool for data management of secondary metabolites\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tFrom LIMSWiki\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\tJump to: navigation, search\n\n\t\t\t\t\t\n\t\t\t\t\tFull article title\n \nSistematX, an online web-based cheminformatics tool for data management of secondary metabolitesJournal\n \nMoleculesAuthor(s)\n \nScotti, Marcus T.; Herrera-Acevedo, Chonny; Oliveira, Tiago B.; Costa, Renan P.O.; de Oliveira Santos, Silas Y.K.;\r\nRodrigues, Ricardo P.; Scotti, Luciana; Da-Costa, Fernando B.Author affiliation(s)\n \nFederal University of Para\u00edba, Federal University of Sergipe, University of S\u00e3o PauloPrimary contact\n \nPhone: +55-83-99869-0415Year published\n \n2018Volume and issue\n \n23(1)Page(s)\n \n103DOI\n \n10.3390\/molecules23010103ISSN\n \n1420-3049Distribution license\n \nCreative Commons Attribution 4.0 InternationalWebsite\n \nhttp:\/\/www.mdpi.com\/1420-3049\/23\/1\/103\/htmDownload\n \nhttp:\/\/www.mdpi.com\/1420-3049\/23\/1\/103\/pdf (PDF)\n\nContents\n\n1 Abstract \n2 Introduction \n3 Results and discussion \n\n3.1 Utility and discussion \n3.2 Data management \n\n\n4 Material and methods \n\n4.1 Implementation \n\n\n5 Conclusions \n6 Acknowledgments \n7 Author contributions \n8 Conflicts of interest \n9 References \n10 Notes \n\n\n\nAbstract \nThe traditional work of a natural products researcher consists in large part of time-consuming experimental work, collecting biota to prepare, extracts to analyze, and innovative metabolites to identify. However, along this long scientific path, much information is lost or restricted to a specific niche. 
The large amounts of data already produced and the science of metabolomics reveal new questions: Are these compounds known or new? How fast can this information be obtained? To answer these and other relevant questions, an appropriate procedure to correctly store information on the data retrieved from the discovered metabolites is necessary. The SistematX (http:\/\/sistematx.ufpb.br) interface is implemented considering the following aspects: (a) the ability to search by structure, SMILES (Simplified Molecular-Input Line-Entry System) code, compound name, and species; (b) the ability to save chemical structures found by searching; (c) the ability to display compound data results, including important characteristics for natural products chemistry; and (d) the user's ability to find specific information for taxonomic rank (from family to species) of the plant from which the compound was isolated, the searched-for molecule, and the bibliographic reference and Global Positioning System (GPS) coordinates. The SistematX homepage allows the user to log into the data management area using a login name and password and gain access to administration pages. In this article, we introduce a modern and innovative web interface for the management of a secondary metabolite database. With its multi-platform design, it is able to be properly consulted via the internet and managed from any accredited computer. The interface provided by SistematX contains a wealth of useful information for the scientific community about natural products, highlighting the locations of species from which compounds are isolated.\nKeywords: SistematX, secondary metabolites, data management, online web-based tool\n\nIntroduction \nThe traditional work of a natural products researcher can be summarized as the collection of biological samples, preparation of extracts for biological screening or bioassay-guided fractionation, and isolation and purification of (bioactive or not) compounds. 
However, the first question that may arise is the following: are these compounds known or new? In addition, metabolomics studies have introduced a new question: how fast can this information be obtained?[1]\nThe stage of dereplication, a process known as the rapid characterization of previously known compounds in mixtures without their prior purification, has become a strategically important area for natural products research involved in screening programs in several commercial and non-commercial databases.[2][3][4] These databases can be searched with minimal information, such as structural chemical and biological data from compounds; however, dereplication now requires additional information, such as biogeographical and taxonomic information, or the presence of a certain compound (new or known) in other individuals of the same species, genus, subfamily, and family. This information can also help to reduce the number of hits during chemical identification by dereplication.\nLarge structure-based data collections, such as ChemSpider[5], PubChem[6], ChEBI[7], and ZINC[8] can be used for this purpose.[9][10] However, these databases are not specialized in secondary metabolite information that is valuable to the natural products researchers, for example, botanical occurrence and geographical localization. For this reason, a number of specialized natural products databases were developed that are commercially or freely available and only contain restricted information, for example, the Dictionary of Natural Products (DNP)[11], NAPRALERT[12], Marinlit for marine natural products[13], and Antibase for microorganisms and higher fungi materials. 
Nevertheless, none of these provide structural collections in a format that can be rapidly integrated into software such as ACD\/Structure Elucidator and others.[9]\nOther natural products databases provide natural products extracted from various resources and contain various associated information such as toxicity prediction, but so far, little or nothing is known about these resources, for example, SUPER NATURAL II.[14] Natural products databases exhibit a huge range of structural complexity and thus are expected to contribute to the ability of such databases to provide positive hits.[2][15] These structures are available in regional databases, for example NuBBEDB[16], SANCDB[17], TM-MC[18], TCM-Database@Taiwan[19], NANPDB[20], and TCMID.[21] Many have been used in virtual screening research studies. In addition to the database information described above that uses two-dimensional (2D) structures, several databases have selected methods and tools for generating three-dimensional (3D) structures of small organic molecules, often for use in structure-based drug design.\nIn addition, databases of natural products that focus on metabolomic studies and on relationships between species and metabolites include the KNApSAcK Family[22], TIPdb-3D[23], and AsterDB[24], which enable searches for chemical structures by plant species names and other taxonomic information. Nevertheless, some data are still lacking for the purpose of exact dereplication. Information such as exact mass and geographic data can be very important for this type of study.[25][26][27]\nIt is not enough simply to focus on the information contained in a database. A clean and user-friendly interface, fast search, and consistency between currently available operating systems (Microsoft Windows, Mac, and Linux) can be just as important. 
For this purpose, the SistematX software was developed to provide the abovementioned information for chemosystematics studies, dereplication, and botanical correlations.\n\nResults and discussion \nUtility and discussion \nThe SistematX homepage is shown in Figure 1A. After the user enters the website (http:\/\/sistematx.ufpb.br), the \u201cStructure search\u201d option is seen with the MarvinJS API (Application Programming Interface) at the top of the screen. Three other search options are available in the interface. The initial screen of the system also shows the SMILES (Simplified Molecular-Input Line-Entry System) code (Figure 1B), compound name (Figure 1C) and plant species search modes (Figure 1D).\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 1. SistematX homepage with different search options: (A) by structure; (B) by Simplified Molecular-Input Line-Entry System (SMILES); (C) by compound name; and (D) by plant species\n\n\n\nIn the first option, the user can perform the search using the drawn full structure or molecular skeleton, fragments, or substructures, which is important in cases when the user only knows a structural characteristic of the structure such as functional groups or when studies require structural similarity, structural groups, or families of compounds. It is possible to use the similarity search option that is currently available in SistematX. The search results page shows all compounds whose similarity is above the cutoff provided by the user, in decreasing order of similarity, with the similarity values shown on top. Substructure and similarity searches are performed using a hashed fingerprint. If special molecular features are present in the query (e.g., stereochemistry, charge), only those targets that also contain the feature match. 
However, if a feature is missing from the query, it is not required to be missing from the target.\nIn addition, it is possible to search by SMILES code, a chemical notation system capable of representing even the most complex organic compounds using a simple grammar that is very well known to organic chemistry researchers. For this reason, this option is offered separately from the structure search using the MarvinJS API, so that the user can simply copy and paste the SMILES code (Figure 1B). It is also possible to search by common (usual) name or IUPAC (International Union of Pure and Applied Chemistry) name (or part of one of these), and by species, although in this option it is necessary to first insert the name of the genus (which presents an autocompletion option). After the genus is selected, the system presents all species available for the user to select for the search.\nWhen performing a search, the mechanism generates a search results page (six results per page), using common names; if the compound does not have one, it shows the IUPAC name (Figure 2). The user can set the number of structure results per page. When a result is selected, the user has access to the data for that molecule, which are classified into six different groups (Figure 3).\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 2. SistematX results page\n\n\n\n\n\n\n\n\n\n\n\n\n Figure 3. SistematX screen for molecular data\n\n\n\nThe first group of results that appears is related to the structural representation of the searched molecule. The 2D structure is shown in the interface, with an option above it to enlarge the view. After this is clicked, the system displays the visualization of the molecule in 2D and 3D (ChemDoodle, iChemLabs, Piscataway, NJ, USA) and an additional option for saving the 2D or 3D structure in an MDL (Molecular Design Limited, San Ramon, CA, USA) Molfile. 
The second type of result exhibited by the system is associated with compound identification, such as common name, SMILES code, IUPAC name, InChI (IUPAC International Chemical Identifier, Research Triangle Park, NC, USA) code, InChIKey code and CAS (Chemical Abstracts Service, Columbus, OH, USA) number. Except for the common name, which is optional and registered by the administrator, all parameters are provided by the JChem API.\nCompound data results include important characteristics for natural products chemistry. The class of secondary metabolite of the searched molecule and its skeleton provide information about its biosynthetic pathway and assist in chemosystematics and chemotaxonomic studies. Oxidation number (NOX), which is calculated based on the Hendrickson rules[28], has been fundamental in chemotaxonomy since Gottlieb related the oxidation grade of molecules to species evolution.[29] Molecular mass is calculated using the most abundant isotope of each element (exact mass) and the average atomic mass of each element (relative mass); these data are important for users working on purification processes and on the structural elucidation of molecules, because mass information is essential for determining the purity of secondary metabolites.\nIn the botanical data field, the user can find specific information such as the taxonomic rank (from family to species) of the plant from which the compound was isolated, the searched molecule, and the bibliographic reference, which includes journal name, volume, page, and year. Because many different species can biosynthesize the same molecule, there is one register per species. 
Meanwhile, the biological data section presents results obtained in studies related to the biological activity of the searched molecule; the type of activity, system, units, activity value, and bibliographic reference are available in this section.\nPlant species have revealed clear genetic signals for local adaptation.[30] One species can synthesize a secondary metabolite depending on its location, and variations in compound concentrations are observed at different sites. Because geographical data are an important parameter in natural products research, SistematX shows geographical coordinates (latitude and longitude) for a searched molecule and an approximate location of the species from which this metabolite was isolated. Using the Google Maps API, the user can observe the species location on the world map.\n\nData management \nOn the SistematX homepage, the user can also log into the data management area using a login name and password (Figure 4A) and from there access the administration pages to edit or register new molecules. Once the corresponding information has been accepted, the data management interface appears.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 4. SistematX creates new registers through an administrator: (A) login and password option; (B) structure view and (C) molecule selection\n\n\n\nThe first requirement to register a new molecule is to insert the structure in the MarvinJS API (Figure 4B). Several methods can be used to accomplish this step: drawing, copying SMILES code, or importing the molecule in a compatible format (e.g., sdf, cdx, mol, mol2). The molecule selection option then appears (Figure 4C), and if the molecule is new to the system, it appears with the option \"New.\" If this option is selected, a blank register page with four subdivisions is shown: Basic Data, Extra Data, Botanical Data, and Geographical Data. 
However, if the molecule already exists in the system, another box with the drawn molecule appears, and after choosing this box, the register page for the structure containing all previously registered information appears (Figure 5).\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 5. SistematX data management interface\n\n\n\nImmediately after the New option is selected, the register page appears, including some basic data generated by the MarvinJS API: SMILES, IUPAC name, InChI, InChIKey, NOX, exact mass, and relative mass. The class and skeleton of metabolites must be chosen to register a new structure in the system; this is a sequential process, and thus, the class must first be selected to make the skeleton structure option available. Classes and skeletons not already registered in the system can be registered via the Class and Skeleton tab on the top of the screen. Common name and CAS number are optional fields.\nThe Extra Data subdivision allows insertion of all structural spectroscopic information, such as 1H- and 13C-NMR (Nuclear Magnetic Resonance) and mass spectra. In addition, it is possible to find 2D NMR information through the HMBC (Heteronuclear Multiple Bond Correlation) technique to establish the relationship between 13C- and 1H-shifts. In the NMR data, the administrator must first select the deuterated solvent used in the spectroscopic studies. If this information does not exist in the options, it can be registered by selecting the Solvent tab on the top of the screen. After the structure appears with an atom numbering assigned by the MarvinJS API (identified as Atom), it is always necessary to check these numbers against the biogenetic numbering (identified as \u201cBiogenetic\u201d) and finally to add the chemical shift value for each atom. For 1H-NMR, it is also possible to register H\u2013H coupling constants (J in Hz). 
Mass spectrum information, molecular mass, and intensity of fragments can also be registered.\nTo create a new register for botanical data of a certain species, the following information is required: journal, year, volume, first page, and last page information. Journal, genus, and species are drop-down lists, and these last two must be filled in this order. If any information relevant to these three fields does not exist, it can be inserted by clicking on \"Journal\" and \"References\" and writing the journal name (autocomplete tab) or by clicking on \"Botanical Data,\" where taxonomic data are registered. Biological activity data appear as activity, system, system type, value, journal, year, and pages. Drop-down lists can be filled with previously registered data or by entering new data in the Biological Activity tab.\nFinally, the administrator can register geographical data using the Google Maps API. For any structure, it is possible to register latitude and longitude of the corresponding species studied. The genus and species boxes are filled in the same manner described above. Longitude and latitude can be inserted in two ways. The first is by writing the coordinates in the boxes; once registered, they appear on the world map as a red indicator showing the location. 
Another method is to select the place by clicking on the red pin on the map; when the pin is released, the values of longitude and latitude appear in the respective boxes.\nCurrently, our database has more than 1300 sesquiterpene lactones and 850 flavonoids and chalcones, with more than 4000 botanical occurrences of the Asteraceae family; approximately 500 alkaloids, which represent more than 750 botanical occurrences of the Apocynaceae family; and several terpenes and alkaloids of Annonaceae, Apocynaceae, and Asteraceae that correspond to more than 800 botanical occurrences.\n\nMaterial and methods \nImplementation \nSistematX was developed in the Java programming language (version 8 or higher), using JSP (JavaServer Pages) technology version 2.1 or higher and MySQL database version 5.5.46-0 for Linux[31] to maintain the system data. An initial version of SistematX web was published in the proceedings of MOL2NET in 2015 to demonstrate its functionalities to the academic community and, with feedback, to improve old functionalities and add new tools.[32] The current version includes relative mass, exact mass, CAS number, and InChI (in the previous version only the InChIKey was available) for each compound. These values are generated automatically, except for the CAS number, which can be added manually. The geographical localization of the species from which the compound was isolated is now also available. It is also possible to perform a structural search by similarity index.\nThe system uses JSP to create pages with specific information for each molecule and to change pages dynamically when certain buttons are clicked. Intermediary pages are used to recover information from the database and insert it in the JSP. The system creates a DAO (data access object) to organize the data on intermediary pages, which work like a bridge from the DAO to the JSP. 
For each database table needed in a request, a DAO is created specifically for that table, containing all attributes from the database table.\nBootstrap version 3.3.5[33], a graphical interface web framework that uses HTML (HyperText Markup Language), CSS (Cascading Style Sheets) and JavaScript, is employed for a better appearance on HTML pages, adapting some functions and styles to aid in the website design. The system also uses jQuery framework version 2.1.4[34], a powerful JavaScript tool to manipulate HTML DOM (document object model) events. Another framework, jQuery AutoComplete version 1.2.18[35], is used to generate autocomplete inputs.\nSeveral APIs are used in the SistematX implementation (Table 1). MarvinJS version 15.7.20, from ChemAxon[36], is the drawing API that is integrated with ChemAxon JChem WebService[37], an external online service that transforms the drawn structure into SMILES code, after which a JChem API function turns it into a binary fingerprint. This fingerprint is used to search for molecules in the database via their structure, using substructure or similarity. The molecule converted to the fingerprint is used as a fragment in the search: its fingerprint is compared to the database molecules\u2019 fingerprints to determine if it exists inside them as a fragment. This API utilizes HTML, CSS, and JavaScript to perform the transformations. We use the standardizer API of ChemAxon JChem Web Services to not only standardize nitro groups but also aromatize the structure search results as correctly as possible, so that the query and the database share a similar representation.\n\n\n\n\n\n\n\nTable 1. Summary of the Application Programming Interfaces (APIs) used by SistematX\n\n\nAPI\n\nDescription\n\nEngine\n\n\n1. 
Structure\n\n\n2D drawing\n\nAllows drawing and visualization of chemical structures\n\nChemAxon\n\n\n3D generator\n\nUses 2D drawing to generate a 3D representation of the molecule\n\nChemAxon\n\n\n3D\n\nGraphical visualization of 3D molecules with JavaScript\n\nChemDoodle\n\n\n2. Compound Identification\n\n\nSMILES\n\nSimplified Molecular Input Line Entry System\n\nChemAxon\n\n\nIUPAC\n\nIUPAC Nomenclature\n\nChemAxon\n\n\nInChI\n\nIUPAC International Chemical Identifier\n\nChemAxon\n\n\nInChIKey\n\nInChIKey is a compact format of the InChI code\n\nChemAxon\n\n\nCAS\n\nChemical Abstracts Service Registry Number\n\nChemAxon\n\n\n3. Compound Data\n\n\nNOX\n\nOxidation number (NOX) of an organic compound\n\nChemAxon\n\n\nExact Mass\n\nUses the mass of the most abundant isotope of each element\n\nChemAxon\n\n\nRelative Mass\n\nUses the average atomic mass of each element\n\nChemAxon\n\n\n4. Geographic Data\n\n\nLatitude\n\nCan be inserted by the administrator or set by clicking on the world map\n\nGoogle Inc.\n\n\nLongitude\n\nCan be inserted by the administrator or set by clicking on the world map\n\nGoogle Inc.\n\n\nApproximate\n\nUsing the latitude and longitude, an approximate location of the species is shown\n\nGoogle Inc.\n\n\nVisualization\n\nUses the world map to visualize the location of the species\n\nGoogle Inc.\n\n\n\nMarvinJS is used to create a 3D structure from a 2D molecule, and ChemDoodle Web Components[38] renders this view with JavaScript in the browser. This API is able to generate several 2D and 3D molecule graphical views using pure JavaScript.\nThe ChemAxon API allows the visualization of compound characteristics. 
SistematX displays general nomenclature information such as the common name, SMILES code, IUPAC name, InChI, InChIKey, CAS registry number and properties such as oxidation number (NOX), exact mass, and relative mass.\nIn addition, Google Maps, from Google Inc.[39], an API used to prepare maps and locations, is used in the system to show the registered metabolite location on the world map. The API draws the map and receives locations from the database, which are two variables representing the latitude and longitude. A registered molecule may have multiple locations and a species linked to it. Using a JavaScript function, it graphically sets the locations on the map. When registering by clicking on the map, it sets a marker at the mouse location and adds a line to the coordinates list below the map for each marker on the map, allowing it automatically to change the position when the value in the latitude or longitude boxes is changed. The coordinates are also transformed into an approximate address, using reverse geocode, a function from the Google Maps API.\n\nConclusions \nIn this article, we introduce a web interface for managing a secondary metabolite database, which is multiplatform and able to be consulted via the internet and managed from any accredited computer. The interface provides a wealth of useful information for the scientific community about natural products, highlighting the location of species from which the compounds were isolated. 
Several new functionalities will be added continuously, such as new calculated molecular descriptors, tools to aid structural elucidation using experimental and calculated NMR data, and support for downloading several structures in one file using a batch data transfer option.\nSistematX is freely accessible on the homepage http:\/\/sistematx.ufpb.br.\n\nAcknowledgments \nWe would like to thank the Student Agreement Program of Graduate (PEC-PG) of CNPq, Brazil and the Brazilian National Council for Scientific and Technological Development (CNPq), Award Number 461093\/2014-6.\n\nAuthor contributions \nS.Y.K.d.O.S., R.P.O.C. and M.T.S. carried out the design and programming and drafted the manuscript. C.H.-A., T.B.O., F.B.D.C., L.S. and R.P.R. carried out the design and tests and drafted the manuscript. C.H.-A., L.S., F.B.D.C. and M.T.S. revised the manuscript critically.\n\nConflicts of interest \nThe authors declare no conflict of interest.\n\nReferences \n\n\n\u2191 Blunt, J.W.; Munro, M.H.G. (2014). \"22. Is There an Ideal Database for Natural Products Research?\". In Osbourn, A.; Goss, R.J.; Carter, G.T.. Natural Products: Discourse, Diversity, and Design. John Wiley & Sons, Inc. pp. 413\u2013431. doi:10.1002\/9781118794623.ch22. ISBN 9781118794623.   \n\n\u2191 2.0 2.1 Harvey, A.L.; Edrada-Ebel, R.; Quinn, R.J. (2015). \"The re-emergence of natural products for drug discovery in the genomics era\". Nature Reviews Drug Discovery 14 (2): 111\u201329. doi:10.1038\/nrd4510. PMID 25614221.   \n\n\u2191 Corley, D.G.; Durley, R.C. (1994). \"Strategies for Database Dereplication of Natural Products\". Journal of Natural Products 57 (11): 1484\u20131490. doi:10.1021\/np50113a002.   \n\n\u2191 Oliveira, T.; Chagas-Paula, D.; Rosa, A. et al. (2013). \"Temporal characteristics of a natural products in-house database\". Planta Medica 79: 1113\u20131114. doi:10.1055\/s-0033-1351852.   \n\n\u2191 Pence, H.E.; Williams, A. (2010). 
\"ChemSpider: An Online Chemical Information Resource\". Journal of Chemical Education 87 (11): 1123\u20131124. doi:10.1021\/ed100697w.   \n\n\u2191 Bolton, E.E.; Wang, Y.; Thiessen, P.A.; Bryant, S.H. (2008). \"Chapter 12 - PubChem: Integrated Platform of Small Molecules and Biological Activities\". In Wheeler, R.A.; Spellmeyer, D.C.. Annual Reports in Computational Chemistry. 4. Elsevier Ltd. pp. 217-241. doi:10.1016\/S1574-1400(08)00012-1. ISBN 9780444532503.   \n\n\u2191 Degtyarenko, K.; de Matos, P.; Ennis, M. et al. (2008). \"ChEBI: A database and ontology for chemical entities of biological interest\". Nucleic Acids Research 36 (DB1): D344-50. doi:10.1093\/nar\/gkm791. PMC PMC2238832. PMID 17932057. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2238832 .   \n\n\u2191 Irwin, J.J.; Shoichet, B.K. (2005). \"ZINC: A free database of commercially available compounds for virtual screening\". Journal of Chemical Information and Modeling 45 (1): 177\u201382. doi:10.1021\/ci049714+. PMC PMC1360656. PMID 15667143. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC1360656 .   \n\n\u2191 9.0 9.1 Williams, R.B.; O'Neil-Johnson, M.; Williams, A.J. et al. (2015). \"Dereplication of natural products using minimal NMR data inputs\". Organic & Biomolecular Chemistry 13 (39): 9957\u201362. doi:10.1039\/c5ob01713k. PMID 26381222.   \n\n\u2191 Eugster, P.J.; Boccard, J.; Debrus, B. et al. (2014). \"Retention time prediction for dereplication of natural products (CxHyOz) in LC-MS metabolite profiling\". Phytochemistry 108: 196\u2013207. doi:10.1016\/j.phytochem.2014.10.005. PMID 25457501.   \n\n\u2191 \"Dictionary of Natural Products\". CRC Press. http:\/\/dnp.chemnetbase.com\/ . Retrieved 04 March 2017 .   \n\n\u2191 Graham, J.G.; Farnsworth, N.R. (2010). \"The NAPRALERT database as an aid for discovery of novel bioactive compounds\". Comprehensive Natural Products II: Chemistry and Biology. 3. Elsevier Ltd. pp. 
81\u201394. ISBN 9780080453828.   \n\n\u2191 Dabb, S.; Blunt, J.; Munro, M. (2014). \"MarinLit: Database and essential tools for the marine natural products community\". Proceedings of the 248th National Meeting of the American Chemical Society (ACS) 248.   \n\n\u2191 Banerjee, P.; Erehman, J.; Gohlke, B.O. et al. (2015). \"Super Natural II: A database of natural products\". Nucleic Acids Research 43 (DB1): D935\u20139. doi:10.1093\/nar\/gku886. PMC PMC4384003. PMID 25300487. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4384003 .   \n\n\u2191 Drewry, D.H.; Macarron, R. (2010). \"Enhancements of screening collections to address areas of unmet medical need: An industry perspective\". Current Opinion in Chemical Biology 14 (3): 289\u201398. doi:10.1016\/j.cbpa.2010.03.024. PMID 20413343.   \n\n\u2191 Valli, M.; dos Santos, R.N.; Figueira, L.D. et al. (2013). \"Development of a natural products database from the biodiversity of Brazil\". Journal of Natural Products 76 (3): 439-44. doi:10.1021\/np3006875. PMID 23330984.   \n\n\u2191 Hatherley, R.; Brown, D.K.; Musyoka, T.M. et al. (2015). \"SANCDB: A South African natural compound database\". Journal of Cheminformatics 7: 29. doi:10.1186\/s13321-015-0080-8. PMC PMC4471313. PMID 26097510. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4471313 .   \n\n\u2191 Kim, S.K.; Nam, S.; Jang, H. et al. (2015). \"TM-MC: A database of medicinal materials and chemical compounds in Northeast Asian traditional medicine\". BMC Complementary and Alternative Medicine 15: 218. doi:10.1186\/s12906-015-0758-5. PMC PMC4495939. PMID 26156871. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4495939 .   \n\n\u2191 Chen, C.Y. (2011). \"TCM Database@Taiwan: The world's largest traditional Chinese medicine database for drug screening in silico\". PLoS One 6 (1): e15939. doi:10.1371\/journal.pone.0015939. PMC PMC3017089. PMID 21253603. 
http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3017089 .   \n\n\u2191 Ntie-Kang, F.; Telukunta, K.K.; D\u00f6ring, K. et al. (2017). \"NANPDB: A Resource for Natural Products from Northern African Sources\". Journal of Natural Products 80 (7): 2067-2076. doi:10.1021\/acs.jnatprod.7b00283. PMID 28641017.   \n\n\u2191 Xue, R.; Fang, Z.; Zhang, M. et al. (2013). \"TCMID: Traditional Chinese Medicine integrative database for herb molecular mechanism analysis\". Nucleic Acids Research 41 (DB1): D1089-95. doi:10.1093\/nar\/gks1100. PMC PMC3531123. PMID 23203875. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3531123 .   \n\n\u2191 Afendi, F.M.; Okada, T.; Yamazaki, M. et al. (2012). \"KNApSAcK family databases: Integrated metabolite-plant species databases for multifaceted plant research\". Plant & Cell Physiology 53 (2): e1. doi:10.1093\/pcp\/pcr165. PMID 22123792.   \n\n\u2191 Tung, C.W.; Lin, Y.C.; Chang, H.S. et al. (2014). \"TIPdb-3D: The three-dimensional structure database of phytochemicals from Taiwan indigenous plants\". Database 2014: bau055. doi:10.1093\/database\/bau055. PMC PMC4057645. PMID 24930145. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4057645 .   \n\n\u2191 \"AsterDB\". AsterBioChem. http:\/\/www.asterbiochem.org\/asterdb . Retrieved 04 March 2017 .   \n\n\u2191 Sampaio, B.L.; Edrada-Ebel, R.; Da Costa, F.B. (2016). \"Effect of the environment on the secondary metabolic profile of Tithonia diversifolia: A model for environmental metabolomics of plants\". Scientific Reports 6: 29265. doi:10.1038\/srep29265. PMC PMC4935878. PMID 27383265. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4935878 .   \n\n\u2191 Schmidt, T.J.; Rzeppa, S.; Kaiser, M.; Brun, R. (2012). \"Larrea tridentata\u2014Absolute configuration of its epoxylignans and investigations on its antiprotozoal activity\". Phytochemistry Letters 5 (3): 632-638. 
doi:10.1016\/j.phytol.2012.06.011.   \n\n\u2191 Gaquerel, E.; Kuhl, C.; Neumann, S. (2013). \"Computational annotation of plant metabolomics profiles via a novel network-assisted approach\". Metabolomics 9 (4): 904\u2013918. doi:10.1007\/s11306-013-0504-2.   \n\n\u2191 Hendrickson, J.B.; Cram, D.J.; Hammond, G.S. (1970). Organic Chemistry. McGraw-Hill. ISBN 070281505.   \n\n\u2191 Gottlieb, O. (1989). \"The role of oxygen in phytochemical evolution towards diversity\". Phytochemistry 28: 2545\u20132558.   \n\n\u2191 Z\u00fcst, T.; Heichinger, C.; Grossniklaus, U. et al. (2012). \"Natural enemies drive geographic variation in plant defenses\". Science 338 (6103): 116\u20139. doi:10.1126\/science.1226397. PMID 23042895.   \n\n\u2191 \"Download MySQL Community Server\". Oracle Corporation. https:\/\/dev.mysql.com\/downloads\/mysql\/ . Retrieved 31 January 2017 .   \n\n\u2191 Scotti, M.T.; Da Silva Junior, R.O.; De Oliveira Santos, S.Y.K. et al. (2015). [https:\/\/sciforum.net\/conference\/MOL2NET-1\/paper\/3348\/download\/pdf \"SISTEMAT X - A Web Tool to Manage Databases of\nSecondary Metabolites\"]. Proceedings of MOL2NET, International Conference on Multidisciplinary Sciences 1 (Section F): 1\u20136. https:\/\/sciforum.net\/conference\/MOL2NET-1\/paper\/3348\/download\/pdf .   \n\n\u2191 \"Bootstrap\". Bootstrap Development Team. http:\/\/getbootstrap.com\/ . Retrieved 10 January 2017 .   \n\n\u2191 \"jQuery - Write less, do more\". The jQuery Foundation. http:\/\/jquery.com\/ . Retrieved 17 January 2017 .   \n\n\u2191 \"jQuery - User interface\". The jQuery Foundation. http:\/\/jqueryui.com\/autocomplete\/ . Retrieved 20 January 2017 .   \n\n\u2191 \"Marvin JS\". ChemAxon Ltd. https:\/\/chemaxon.com\/products\/marvin-js . Retrieved 24 January 2017 .   \n\n\u2191 \"JChem Engines\". ChemAxon Ltd. https:\/\/chemaxon.com\/products\/jchem-engines . Retrieved 24 January 2017 .   \n\n\u2191 \"ChemDoodle Web Components\". iChemLabs, LLC. https:\/\/web.chemdoodle.com\/ . 
Retrieved 24 January 2017 .   \n\n\u2191 \"Google Maps APIs\". Google Inc. https:\/\/developers.google.com\/maps\/ . Retrieved 13 January 2017 .   \n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added.\n\n\n\n\n\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\">https:\/\/www.limswiki.org\/index.php\/Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites<\/a>\n\t\t\t\t\tCategories: LIMSwiki journal articles (added in 2018)LIMSwiki journal articles (all)LIMSwiki journal articles on chemical informaticsLIMSwiki journal articles on software\nThis page was last modified on 8 January 2018, at 21:23.\nContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n","478e0fc6bdbd74f64773b750f2c9edcc_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_SistematX_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:SistematX, an online web-based cheminformatics tool for data management of secondary metabolites<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p>The traditional work of a 
natural products researcher consists in large part of time-consuming experimental work, collecting biota to prepare, extracts to analyze, and innovative metabolites to identify. However, along this long scientific path, much information is lost or restricted to a specific niche. The large amounts of data already produced and the science of metabolomics reveal new questions: Are these compounds known or new? How fast can this information be obtained? To answer these and other relevant questions, an appropriate procedure to correctly store <a href=\"https:\/\/www.limswiki.org\/index.php\/Information\" title=\"Information\" target=\"_blank\" class=\"wiki-link\" data-key=\"6300a14d9c2776dcca0999b5ed940e7d\">information<\/a> on the data retrieved from the discovered metabolites is necessary. The SistematX (<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/sistematx.ufpb.br\" target=\"_blank\">http:\/\/sistematx.ufpb.br<\/a>) interface is implemented considering the following aspects: (a) the ability to search by structure, <a href=\"https:\/\/www.limswiki.org\/index.php\/SMILES_(language)\" title=\"SMILES (language)\" target=\"_blank\" class=\"wiki-link\" data-key=\"56a933da23aa353feb9b8449d7c57f9f\">SMILES<\/a> (Simplified Molecular-Input Line-Entry System) code, compound name, and species; (b) the ability to save chemical structures found by searching; (c) the ability to display compound data results, including important characteristics for natural products chemistry; and (d) the user's ability to find specific information for taxonomic rank (from family to species) of the plant from which the compound was isolated, the searched-for molecule, and the bibliographic reference and Global Positioning System (GPS) coordinates. The SistematX homepage allows the user to log into the data management area using a login name and password and gain access to administration pages. 
In this article, we introduce a modern and innovative web interface for the management of a secondary metabolite database. With its multi-platform design, it is able to be properly consulted via the internet and managed from any accredited computer. The interface provided by SistematX contains a wealth of useful information for the scientific community about natural products, highlighting the locations of species from which compounds are isolated.\n<\/p><p><b>Keywords<\/b>: SistematX, secondary metabolites, data management, online web-based tool\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<p>The traditional work of a natural products researcher can be summarized as the collection of biological samples, preparation of extracts for biological screening or bioassay-guided fractionation, and isolation and purification of (bioactive or not) compounds. However, the first question that may arise is the following: are these compounds known or new? In addition, metabolomics studies have introduced a new question: how fast can this information be obtained?<sup id=\"rdp-ebb-cite_ref-BluntIsThere14_1-0\" class=\"reference\"><a href=\"#cite_note-BluntIsThere14-1\" rel=\"external_link\">[1]<\/a><\/sup>\n<\/p><p>The stage of dereplication, a process known as the rapid characterization of previously known compounds in mixtures without their prior purification, has become a strategically important area for natural products research involved in screening programs in several commercial and non-commercial databases.<sup id=\"rdp-ebb-cite_ref-HarveyTheRe15_2-0\" class=\"reference\"><a href=\"#cite_note-HarveyTheRe15-2\" rel=\"external_link\">[2]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-CorleyStrategies94_3-0\" class=\"reference\"><a href=\"#cite_note-CorleyStrategies94-3\" rel=\"external_link\">[3]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-OliveiraTemporal13_4-0\" class=\"reference\"><a href=\"#cite_note-OliveiraTemporal13-4\" 
rel=\"external_link\">[4]<\/a><\/sup> These databases can be searched with minimal information, such as structural chemical and biological data from compounds; however, dereplication now requires additional information, such as biogeographical and taxonomic information, or the presence of a certain compound (new or known) in other individuals of the same species, genus, subfamily, and family. This information can also help to reduce the number of hits during chemical identification by dereplication.\n<\/p><p>Large structure-based data collections, such as ChemSpider<sup id=\"rdp-ebb-cite_ref-PenceChemSpider10_5-0\" class=\"reference\"><a href=\"#cite_note-PenceChemSpider10-5\" rel=\"external_link\">[5]<\/a><\/sup>, PubChem<sup id=\"rdp-ebb-cite_ref-BoltonPubChem08_6-0\" class=\"reference\"><a href=\"#cite_note-BoltonPubChem08-6\" rel=\"external_link\">[6]<\/a><\/sup>, ChEBI<sup id=\"rdp-ebb-cite_ref-DegtyarenkoChEBI08_7-0\" class=\"reference\"><a href=\"#cite_note-DegtyarenkoChEBI08-7\" rel=\"external_link\">[7]<\/a><\/sup>, and ZINC<sup id=\"rdp-ebb-cite_ref-IrwinZINC05_8-0\" class=\"reference\"><a href=\"#cite_note-IrwinZINC05-8\" rel=\"external_link\">[8]<\/a><\/sup> can be used for this purpose.<sup id=\"rdp-ebb-cite_ref-WilliamsDerep15_9-0\" class=\"reference\"><a href=\"#cite_note-WilliamsDerep15-9\" rel=\"external_link\">[9]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-EugsterRetention14_10-0\" class=\"reference\"><a href=\"#cite_note-EugsterRetention14-10\" rel=\"external_link\">[10]<\/a><\/sup> However, these databases are not specialized in secondary metabolite information that is valuable to the natural products researchers, for example, botanical occurrence and geographical localization. 
For this reason, a number of specialized natural products databases were developed that are commercially or freely available and only contain restricted information, for example, the Dictionary of Natural Products (DNP)<sup id=\"rdp-ebb-cite_ref-CRCDictionary_11-0\" class=\"reference\"><a href=\"#cite_note-CRCDictionary-11\" rel=\"external_link\">[11]<\/a><\/sup>, NAPRALERT<sup id=\"rdp-ebb-cite_ref-GrahamTheNAPRALERT10_12-0\" class=\"reference\"><a href=\"#cite_note-GrahamTheNAPRALERT10-12\" rel=\"external_link\">[12]<\/a><\/sup>, Marinlit for marine natural products<sup id=\"rdp-ebb-cite_ref-DabbMarinLit_13-0\" class=\"reference\"><a href=\"#cite_note-DabbMarinLit-13\" rel=\"external_link\">[13]<\/a><\/sup>, and Antibase for microorganisms and higher fungi materials. Nevertheless, none of these provide structural collections in a format that can be rapidly integrated into software such as ACD\/Structure Elucidator and others.<sup id=\"rdp-ebb-cite_ref-WilliamsDerep15_9-1\" class=\"reference\"><a href=\"#cite_note-WilliamsDerep15-9\" rel=\"external_link\">[9]<\/a><\/sup>\n<\/p><p>Other natural products databases provide natural products extracted from various resources and contain various associated information such as toxicity prediction, but so far, little or nothing is known about these resources, for example, SUPER NATURAL II.<sup id=\"rdp-ebb-cite_ref-BanerjeeSuper15_14-0\" class=\"reference\"><a href=\"#cite_note-BanerjeeSuper15-14\" rel=\"external_link\">[14]<\/a><\/sup> Natural products databases exhibit a huge range of structural complexity and thus are expected to contribute to the ability of such databases to provide positive hits.<sup id=\"rdp-ebb-cite_ref-HarveyTheRe15_2-1\" class=\"reference\"><a href=\"#cite_note-HarveyTheRe15-2\" rel=\"external_link\">[2]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-DrewryEnhance10_15-0\" class=\"reference\"><a href=\"#cite_note-DrewryEnhance10-15\" rel=\"external_link\">[15]<\/a><\/sup> These structures are available in 
regional databases, for example NuBBEDB<sup id=\"rdp-ebb-cite_ref-ValliDevelop13_16-0\" class=\"reference\"><a href=\"#cite_note-ValliDevelop13-16\" rel=\"external_link\">[16]<\/a><\/sup>, SANCDB<sup id=\"rdp-ebb-cite_ref-HatherleySANCDB15_17-0\" class=\"reference\"><a href=\"#cite_note-HatherleySANCDB15-17\" rel=\"external_link\">[17]<\/a><\/sup>, TM-MC<sup id=\"rdp-ebb-cite_ref-KimTM-MC15_18-0\" class=\"reference\"><a href=\"#cite_note-KimTM-MC15-18\" rel=\"external_link\">[18]<\/a><\/sup>, TCM Database@Taiwan<sup id=\"rdp-ebb-cite_ref-ChenTCM11_19-0\" class=\"reference\"><a href=\"#cite_note-ChenTCM11-19\" rel=\"external_link\">[19]<\/a><\/sup>, NANPDB<sup id=\"rdp-ebb-cite_ref-NtieNANPDB17_20-0\" class=\"reference\"><a href=\"#cite_note-NtieNANPDB17-20\" rel=\"external_link\">[20]<\/a><\/sup>, and TCMID.<sup id=\"rdp-ebb-cite_ref-XueTCMID13_21-0\" class=\"reference\"><a href=\"#cite_note-XueTCMID13-21\" rel=\"external_link\">[21]<\/a><\/sup> Many have been used in virtual screening research studies. 
In addition to the database information described above that uses two-dimensional (2D) structures, several databases have selected methods and tools for generating three-dimensional (3D) structures of small organic molecules, often for use in structure-based drug design.\n<\/p><p>In addition, databases of natural products with a focus on metabolomic studies with relationships between species-metabolites include the KNApSAcK Family<sup id=\"rdp-ebb-cite_ref-AfendiKNApSAcK12_22-0\" class=\"reference\"><a href=\"#cite_note-AfendiKNApSAcK12-22\" rel=\"external_link\">[22]<\/a><\/sup>, TIPdb-3D<sup id=\"rdp-ebb-cite_ref-TungTIPdb14_23-0\" class=\"reference\"><a href=\"#cite_note-TungTIPdb14-23\" rel=\"external_link\">[23]<\/a><\/sup>, and AsterDB<sup id=\"rdp-ebb-cite_ref-ABCAsterDB_24-0\" class=\"reference\"><a href=\"#cite_note-ABCAsterDB-24\" rel=\"external_link\">[24]<\/a><\/sup>, which enable searches for chemical structures by plant species names and other taxonomic information. Nevertheless, some data are still lacking for the purpose of exact dereplication. Information such as exact mass and geographic data can be very important for this type of study.<sup id=\"rdp-ebb-cite_ref-SampaioEffect16_25-0\" class=\"reference\"><a href=\"#cite_note-SampaioEffect16-25\" rel=\"external_link\">[25]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-SchmidtLarrea12_26-0\" class=\"reference\"><a href=\"#cite_note-SchmidtLarrea12-26\" rel=\"external_link\">[26]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-GaquerelComput13_27-0\" class=\"reference\"><a href=\"#cite_note-GaquerelComput13-27\" rel=\"external_link\">[27]<\/a><\/sup>\n<\/p><p>It is not enough simply to focus on the information contained in a database. A clean and user-friendly interface, fast search, and consistency between currently available operating systems (Microsoft Windows, Mac, and Linux) can be just as important. 
For this purpose, the SistematX software was developed to provide the abovementioned information for chemosystematics studies, dereplication, and botanical correlations.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Results_and_discussion\">Results and discussion<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Utility_and_discussion\">Utility and discussion<\/span><\/h3>\n<p>The SistematX homepage is shown in Figure 1A. After the user enters the website (<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/sistematx.ufpb.br\" target=\"_blank\">http:\/\/sistematx.ufpb.br<\/a>), the \u201cStructure search\u201d option is seen with the MarvinJS API (Application Programming Interface) at the top of the screen. Another three search options can be exhibited in the interface. The initial screen of the system also shows the SMILES (Simplified Molecular-Input Line-Entry System) code (Figure 1B), compound name (Figure 1C) and plant species search modes (Figure 1D).\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig1_Scotti_Molecules2018_23-1.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"a9b7450e6bdb8de295713ba2b53e8e7d\"><img alt=\"Fig1 Scotti Molecules2018 23-1.png\" src=\"https:\/\/www.limswiki.org\/images\/b\/b4\/Fig1_Scotti_Molecules2018_23-1.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 1.<\/b> SistematX homepage with different search options: <b>(A)<\/b> by structure; <b>(B)<\/b> by Simplified Molecular-Input Line-Entry System (SMILES); <b>(C)<\/b> by compound name; and <b>(D)<\/b> by plant species<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>In the first option, the user can perform the search 
using a fully drawn structure, a molecular skeleton, fragments, or substructures. This is important when the user knows only a structural characteristic of the compound, such as its functional groups, or when a study calls for structural similarity, structural groups, or families of compounds. A similarity search option is also available in SistematX. The search results page lists all compounds whose similarity is above the cutoff provided by the user, in decreasing order of similarity, with the similarity values shown at the top. Substructure and similarity searches are performed using a hashed fingerprint. If special molecular features are present in the query (e.g., stereochemistry, charge), only targets that also contain those features match. However, if a feature is missing from the query, it is not required to be missing from the target.\n<\/p><p>In addition, it is possible to search by SMILES code, a chemical notation system capable of representing even the most complex organic compounds using a simple grammar that is well known to organic chemistry researchers. For this reason, this option is offered separately from the structure search that uses the MarvinJS API, so that the user can simply copy and paste the SMILES code (Figure 1B). Searches can also be made by common (usual) name or IUPAC (International Union of Pure and Applied Chemistry) name (or part of one of these), and by species, although for the latter it is necessary to first enter the name of the genus (an autocompletion option is provided). Once the genus is selected, the system presents all available species for the user to select for the search.\n<\/p><p>When a search is performed, the system generates a search results page (six results per page) that lists compounds by common name; if a compound does not have one, its IUPAC name is shown (Figure 2). The user can set the number of structure results per page. 
When a result is selected, the user has access to the data for that molecule, which are classified into six different groups (Figure 3).\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig2_Scotti_Molecules2018_23-1.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"d6ca163a727209f4dd805a995b6f7c41\"><img alt=\"Fig2 Scotti Molecules2018 23-1.png\" src=\"https:\/\/www.limswiki.org\/images\/5\/56\/Fig2_Scotti_Molecules2018_23-1.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 2.<\/b> SistematX results page<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p><a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig3_Scotti_Molecules2018_23-1.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"c1e26c25e04753e2d616ba59bb64d378\"><img alt=\"Fig3 Scotti Molecules2018 23-1.png\" src=\"https:\/\/www.limswiki.org\/images\/3\/37\/Fig3_Scotti_Molecules2018_23-1.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 3.<\/b> SistematX screen for molecular data<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>The first group of results that appears is related to the structural representation of the searched molecule. The 2D structure is observed in the interface; on top, this appears as the option to amplify. 
After this is clicked, the system displays the visualization of the molecule in 2D and 3D (ChemDoodle, iChemLabs, Piscataway, NJ, USA) and an additional option for saving the 2D or 3D structure in an MDL (Molecular Design Limited, San Ramon, CA, USA) Molfile. The second type of result exhibited by the system is associated with compound identification, such as common name, SMILES code, IUPAC name, InChI (IUPAC International Chemical Identifier, Research Triangle Park, NC, USA) code, InChIKey code and CAS (Chemical Abstracts Service, Columbus, OH, USA) number. Except for the common name, which is optional and registered by the administrator, all parameters are provided by the JChem API.\n<\/p><p>Compound data results include important characteristics for natural products chemistry. The class of secondary metabolite of the searched molecule and its skeleton provide information about its biosynthetic pathway and assist in chemosystematics and chemotaxonomic studies. Oxidation number (NOX), which is calculated based on the Hendrickson rules<sup id=\"rdp-ebb-cite_ref-HendricksonOrganic70_28-0\" class=\"reference\"><a href=\"#cite_note-HendricksonOrganic70-28\" rel=\"external_link\">[28]<\/a><\/sup>, has been fundamental in chemotaxonomy since Gottlieb related the oxidation grade of molecules to species evolution.<sup id=\"rdp-ebb-cite_ref-GottliebTheRole89_29-0\" class=\"reference\"><a href=\"#cite_note-GottliebTheRole89-29\" rel=\"external_link\">[29]<\/a><\/sup> Molecular mass is calculated using the most abundant isotope of each element (exact mass) and the average atomic mass of each element (relative mass). These data are important for users working on purification processes and on the structural elucidation of molecules, because mass information is essential for determining the purity of secondary metabolites.\n<\/p><p>In the botanical data field, the user can find specific information such as the taxonomic rank (from family to species) of the plant from 
which the compound was isolated, the searched molecule, and the bibliographic reference, which includes journal name, volume, page, and year. Because many different species can biosynthesize the same molecule, there is one register per species. Meanwhile, the biological data section presents results from studies of the biological activity of the searched molecule; the type of activity, system, units, activity value, and bibliographic reference are available in this section.\n<\/p><p>Plant species have revealed clear genetic signals for local adaptation.<sup id=\"rdp-ebb-cite_ref-Z.C3.BCstNatural12_30-0\" class=\"reference\"><a href=\"#cite_note-Z.C3.BCstNatural12-30\" rel=\"external_link\">[30]<\/a><\/sup> Whether a species synthesizes a given secondary metabolite can depend on its location, and compound concentrations have been observed to vary between sites. Because geographical data is an important parameter in natural products research, SistematX shows geographical coordinates (latitude and longitude) for a searched molecule and an approximate location of the species from which this metabolite was isolated. Using the Google Maps API, the user can observe the species location on the world map.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Data_management\">Data management<\/span><\/h3>\n<p>On the SistematX homepage, the user can also log into the data management area using a login name and password (Figure 4A) and from there access the administration pages to edit or register new molecules. 
Once the corresponding information has been accepted, the data management interface appears.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig4_Scotti_Molecules2018_23-1.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"fdb40d0d6d035e20984a41212fb43b67\"><img alt=\"Fig4 Scotti Molecules2018 23-1.png\" src=\"https:\/\/www.limswiki.org\/images\/1\/16\/Fig4_Scotti_Molecules2018_23-1.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 4.<\/b> SistematX creates new registers through an administrator: <b>(A)<\/b> login and password option; <b>(B)<\/b> structure view and <b>(C)<\/b> molecule selection<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>The first requirement to register a new molecule is to insert the structure in the MarvinJS API (Figure 4B). Several methods can be used to accomplish this step: drawing, copying SMILES code, or importing the molecule in a compatible format (e.g., sdf, cdx, mol, mol2). The molecule selection option then appears (Figure 4C), and if the molecule is new to the system, it appears with the option \"New.\" If this option is selected, a blank register page with four subdivisions is shown: Basic Data, Extra Data, Botanical Data, and Geographical Data. 
However, if the molecule already exists in the system, another box with the drawn molecule appears, and after choosing this box, the register page for the structure containing all previously registered information appears (Figure 5).\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig5_Scotti_Molecules2018_23-1.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"13e62d47a61ba4b3806b4009e1af7685\"><img alt=\"Fig5 Scotti Molecules2018 23-1.png\" src=\"https:\/\/www.limswiki.org\/images\/0\/05\/Fig5_Scotti_Molecules2018_23-1.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 5.<\/b> SistematX data management interface<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>Immediately after the New option is selected, it appears in the register page, including some basic data generated by the MarvinJS API: SMILES, IUPAC name, InChI, InChIKey, NOX, exact mass, and relative mass. The class and skeleton of metabolites must be chosen to register a new structure in the system; this is a sequential process, and thus, the class must first be selected to make the skeleton structure option available. Classes and skeletons not already registered in the system can be registered via the Class and Skeleton tab on the top of the screen. Common name and CAS are optional registration options.\n<\/p><p>The Extra Data subdivision allows insertion of all structural spectroscopic information, such as 1H- and 13C-NMR (Nuclear Magnetic Resonance) and mass spectra. In addition, it is possible to find 2D NMR information through the HMBC (Heteronuclear Multiple Bond Correlation) technique to establish the relationship between 13C- and 1H-shifts. 
In the NMR data, the administrator must first select the deuterated solvent used in the spectroscopic studies. If this information does not exist in the options, it can be registered by selecting the Solvent tab on the top of the screen. After the structure appears with an atomic numeration assigned by the MarvinJS API (identified as \u201cAtom\u201d), it is always necessary to check these numbers against the biogenetic numeration (identified as \u201cBiogenetic\u201d) and finally to add the chemical shift value for each atom. For 1H-NMR, it is also possible to register H\u2013H coupling constants (J in Hz). Mass spectrum information, molecular mass, and intensity of fragments can also be registered.\n<\/p><p>To create a new register for botanical data of a certain species, the following information is required: journal, year, volume, first page, and last page. Journal, genus, and species are drop-down lists, and the last two must be filled in that order. If any information relevant to these three fields does not exist, it can be inserted by clicking on \"Journal\" and \"References\" and writing the journal name (autocomplete tab) or by clicking on \"Botanical Data,\" where taxonomic data are registered. Biological activity data appear as activity, system, system type, value, journal, year, and pages. Drop-down lists can be filled with previously registered data or by entering new data in the Biological Activity tab.\n<\/p><p>Finally, the administrator can register geographical data using the Google Maps API. For any structure, it is possible to register the latitude and longitude of the corresponding species studied. The genus and species boxes are filled in the same manner described above. Longitude and latitude can be entered in two ways. The first is to type the coordinates into the corresponding boxes; once registered, they appear on the world map as a red indicator showing the location. 
Another method is to select the place by clicking on the red pin on the map; when the pin is released, the values of longitude and latitude appear in the respective boxes.\n<\/p><p>Currently, our database has more than 1300 sesquiterpene lactones and 850 flavonoids and chalcones, with more than 4000 botanical occurrences of the <i>Asteraceae<\/i> family; approximately 500 alkaloids, representing more than 750 botanical occurrences of the <i>Apocynaceae<\/i> family; and several terpenes and alkaloids of <i>Annonaceae<\/i>, <i>Apocynaceae<\/i>, and <i>Asteraceae<\/i> that correspond to more than 800 botanical occurrences.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Material_and_methods\">Material and methods<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Implementation\">Implementation<\/span><\/h3>\n<p>SistematX was developed in the Java programming language (version 8 or higher), using JSP (JavaServer Pages) technology version 2.1 or higher and <a href=\"https:\/\/www.limswiki.org\/index.php\/MySQL\" title=\"MySQL\" target=\"_blank\" class=\"wiki-link\" data-key=\"35005451bfcd508bce47c58e72260128\">MySQL<\/a> database version 5.5.46-0 for Linux<sup id=\"rdp-ebb-cite_ref-MySQLDownload_31-0\" class=\"reference\"><a href=\"#cite_note-MySQLDownload-31\" rel=\"external_link\">[31]<\/a><\/sup> to maintain the system data. An initial version of SistematX web was published in the proceedings of MOL2NET in 2015 to demonstrate its functionalities to the academic community and, with feedback, to improve old functionalities and add new tools.<sup id=\"rdp-ebb-cite_ref-ScottiSistematiX15_32-0\" class=\"reference\"><a href=\"#cite_note-ScottiSistematiX15-32\" rel=\"external_link\">[32]<\/a><\/sup> The current version additionally provides relative mass, exact mass, CAS number, and InChI (in the previous version, only the InChIKey was available) for each compound. 
If the CAS number is not generated automatically, it can be added manually, and the geographical location of the species from which the compound was isolated is now available. It is also possible to perform a structural search by similarity index.\n<\/p><p>The system uses JSP to create pages with specific information for each molecule and to change pages dynamically when certain buttons are clicked. Intermediary pages are used to recover information from the database and insert it in the JSP. The system creates a DAO (data access object) to organize the data on the intermediary pages, which act as a bridge between the DAO and the JSP. For each database table needed in a request, a DAO is created specifically for that table, containing all attributes from the database table.\n<\/p><p>Bootstrap version 3.3.5<sup id=\"rdp-ebb-cite_ref-Bootstrap_33-0\" class=\"reference\"><a href=\"#cite_note-Bootstrap-33\" rel=\"external_link\">[33]<\/a><\/sup>, a graphical interface web framework that uses HTML (HyperText Markup Language), CSS (Cascading Style Sheets), and JavaScript, is used to improve the appearance of the HTML pages, adapting some functions and styles to aid in the website design. The system also uses jQuery framework version 2.1.4<sup id=\"rdp-ebb-cite_ref-jQuery_34-0\" class=\"reference\"><a href=\"#cite_note-jQuery-34\" rel=\"external_link\">[34]<\/a><\/sup>, a powerful JavaScript tool to manipulate HTML DOM (document object model) events. Another framework, jQuery AutoComplete version 1.2.18<sup id=\"rdp-ebb-cite_ref-jQueryAuto_35-0\" class=\"reference\"><a href=\"#cite_note-jQueryAuto-35\" rel=\"external_link\">[35]<\/a><\/sup>, is used to generate autocomplete inputs.\n<\/p><p>Several APIs are used in the SistematX implementation (Table 1). 
MarvinJS version 15.7.20, from ChemAxon<sup id=\"rdp-ebb-cite_ref-MarvinJS_36-0\" class=\"reference\"><a href=\"#cite_note-MarvinJS-36\" rel=\"external_link\">[36]<\/a><\/sup>, is the drawing API that is integrated with ChemAxon JChem WebService<sup id=\"rdp-ebb-cite_ref-jChemEngine_37-0\" class=\"reference\"><a href=\"#cite_note-jChemEngine-37\" rel=\"external_link\">[37]<\/a><\/sup>, an external online service that transforms the drawn structure into SMILES code, after which a JChem API function turns it into a binary fingerprint. This fingerprint is used to search the database for molecules by structure, using substructure or similarity matching. The query molecule, once converted to a fingerprint, is treated as a fragment and compared with the fingerprints of the database molecules to determine whether it occurs within them. This API utilizes HTML, CSS, and JavaScript to perform the transformations. We use the Standardizer API of ChemAxon JChem Web Services not only to standardize nitro groups but also to aromatize structures, so that the query and the database entries share a similar representation and the search results are as correct as possible.\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"3\"><b>Table 1.<\/b> Summary of the application programming interfaces (APIs) used by SistematX\n<\/td><\/tr>\n<tr>\n<th style=\"background-color:#dddddd; padding-left:10px; padding-right:10px;\">API\n<\/th>\n<th style=\"background-color:#dddddd; padding-left:10px; padding-right:10px;\">Description\n<\/th>\n<th style=\"background-color:#dddddd; padding-left:10px; padding-right:10px;\">Engine\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px; text-align:center;\" colspan=\"3\"><b>1. 
Structure<\/b>\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><b>2D drawing<\/b>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Allows drawing and visualization of chemical structures\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">ChemAxon\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><b>3D generator<\/b>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Uses 2D drawing to generate a 3D representation of the molecule\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">ChemAxon\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><b>3D<\/b>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Graphical visualization of 3D molecules with JavaScript\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">ChemDoodle\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px; text-align:center;\" colspan=\"3\"><b>2. 
Compound Identification<\/b>\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><b>SMILES<\/b>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Simplified Molecular Input Line Entry System\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">ChemAxon\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><b>IUPAC<\/b>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">IUPAC Nomenclature\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">ChemAxon\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><b>InChI<\/b>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">IUPAC International Chemical Identifier\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">ChemAxon\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><b>InChIKey<\/b>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">InChIKey is a compact format of the InChI code\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">ChemAxon\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><b>CAS<\/b>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Chemical Abstracts Service Registry Number\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">ChemAxon\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px; text-align:center;\" colspan=\"3\"><b>3. 
Compound Data<\/b>\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><b>NOX<\/b>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Oxidation number (NOX) of an organic compound\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">ChemAxon\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><b>Exact Mass<\/b>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Uses the mass of the most abundant isotope of each element\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">ChemAxon\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><b>Relative Mass<\/b>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Uses the average atomic mass of each element\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">ChemAxon\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px; text-align:center;\" colspan=\"3\"><b>4. 
Geographic Data<\/b>\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><b>Latitude<\/b>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Can be inserted by the administrator or set by clicking on the world map\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Google Inc.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><b>Longitude<\/b>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Can be inserted by the administrator or set by clicking on the world map\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Google Inc.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><b>Approximate<\/b>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Shows an approximate location of the species, derived from the latitude and longitude\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Google Inc.\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"><b>Visualization<\/b>\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Uses the world map to visualize the location of the species\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Google Inc.\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>MarvinJS is used to create a 3D structure from a 2D molecule, and ChemDoodle Web Components<sup id=\"rdp-ebb-cite_ref-ChemDoodle_38-0\" class=\"reference\"><a href=\"#cite_note-ChemDoodle-38\" rel=\"external_link\">[38]<\/a><\/sup> displays this view in the browser using JavaScript. 
This API is able to generate several 2D and 3D molecule graphical views using pure JavaScript.\n<\/p><p>The ChemAxon API allows the visualization of compound characteristics. SistematX displays general nomenclature information such as the common name, SMILES code, IUPAC name, InChI, InChIKey, and CAS registry number, and properties such as oxidation number (NOX), exact mass, and relative mass.\n<\/p><p>In addition, Google Maps, from Google Inc.<sup id=\"rdp-ebb-cite_ref-GoogleMaps_39-0\" class=\"reference\"><a href=\"#cite_note-GoogleMaps-39\" rel=\"external_link\">[39]<\/a><\/sup>, an API used to prepare maps and locations, is used in the system to show the registered metabolite location on the world map. The API draws the map and receives each location from the database as two variables, the latitude and the longitude. A registered molecule may have multiple locations and a species linked to it. Using a JavaScript function, it graphically sets the locations on the map. When registering by clicking on the map, it sets a marker at the mouse location and adds a line to the coordinates list below the map for each marker, so that a marker\u2019s position changes automatically when the value in its latitude or longitude box is changed. The coordinates are also transformed into an approximate address using reverse geocoding, a function of the Google Maps API.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Conclusions\">Conclusions<\/span><\/h2>\n<p>In this article, we introduce a web interface for managing a secondary metabolite database, which is multiplatform and able to be consulted via the internet and managed from any accredited computer. The interface provides a wealth of useful information for the scientific community about natural products, highlighting the location of species from which the compounds were isolated. 
Several new functionalities will be added continuously, such as new calculated molecular descriptors, tools to aid structural elucidation using experimental and calculated NMR data, and support for downloading several structures in one file using a batch transfer data option.\n<\/p><p>SistematX is freely accessible on the homepage <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/sistematx.ufpb.br\" target=\"_blank\">http:\/\/sistematx.ufpb.br<\/a>.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Acknowledgments\">Acknowledgments<\/span><\/h2>\n<p>We would like to thank the Graduate Student Agreement Program (PEC-PG) of CNPq, Brazil, and the Brazilian National Council for Scientific and Technological Development (CNPq), Award Number 461093\/2014-6.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Author_contributions\">Author contributions<\/span><\/h2>\n<p>S.Y.K.d.O.S., R.P.O.C. and M.T.S. carried out the design and programming and drafted the manuscript. C.H.-A., T.B.O., F.B.D.C., L.S. and R.P.R. carried out the design and tests and drafted the manuscript. C.H.-A., L.S., F.B.D.C. and M.T.S. revised the manuscript critically.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Conflicts_of_interest\">Conflicts of interest<\/span><\/h2>\n<p>The authors declare no conflict of interest.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-BluntIsThere14-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BluntIsThere14_1-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Blunt, J.W.; Munro, M.H.G. (2014). \"22. Is There an Ideal Database for Natural Products Research?\". In Osbourn, A.; Goss, R.J.; Carter, G.T.. <i>Natural Products: Discourse, Diversity, and Design<\/i>. 
John Wiley & Sons, Inc. pp. 413\u2013431. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1002%2F9781118794623.ch22\" target=\"_blank\">10.1002\/9781118794623.ch22<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9781118794623.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=22.+Is+There+an+Ideal+Database+for+Natural+Products+Research%3F&rft.atitle=Natural+Products%3A+Discourse%2C+Diversity%2C+and+Design&rft.aulast=Blunt%2C+J.W.%3B+Munro%2C+M.H.G.&rft.au=Blunt%2C+J.W.%3B+Munro%2C+M.H.G.&rft.date=2014&rft.pages=pp.%26nbsp%3B413%E2%80%93431&rft.pub=John+Wiley+%26+Sons%2C+Inc&rft_id=info:doi\/10.1002%2F9781118794623.ch22&rft.isbn=9781118794623&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HarveyTheRe15-2\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-HarveyTheRe15_2-0\" rel=\"external_link\">2.0<\/a><\/sup> <sup><a href=\"#cite_ref-HarveyTheRe15_2-1\" rel=\"external_link\">2.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Harvey, A.L.; Edrada-Ebel, R.; Quinn, R.J. (2015). \"The re-emergence of natural products for drug discovery in the genomics era\". <i>Nature Reviews Drug Discovery<\/i> <b>14<\/b> (2): 111\u201329. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fnrd4510\" target=\"_blank\">10.1038\/nrd4510<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25614221\" target=\"_blank\">25614221<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+re-emergence+of+natural+products+for+drug+discovery+in+the+genomics+era&rft.jtitle=Nature+Reviews+Drug+Discovery&rft.aulast=Harvey%2C+A.L.%3B+Edrada-Ebel%2C+R.%3B+Quinn%2C+R.J.&rft.au=Harvey%2C+A.L.%3B+Edrada-Ebel%2C+R.%3B+Quinn%2C+R.J.&rft.date=2015&rft.volume=14&rft.issue=2&rft.pages=111%E2%80%9329&rft_id=info:doi\/10.1038%2Fnrd4510&rft_id=info:pmid\/25614221&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CorleyStrategies94-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CorleyStrategies94_3-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Corley, D.G.; Durley, R.C. (1994). \"Strategies for Database Dereplication of Natural Products\". <i>Journal of Natural Products<\/i> <b>57<\/b> (11): 1484\u20131490. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1021%2Fnp50113a002\" target=\"_blank\">10.1021\/np50113a002<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Strategies+for+Database+Dereplication+of+Natural+Products&rft.jtitle=Journal+of+Natural+Products&rft.aulast=Corley%2C+D.G.%3B+Durley%2C+R.C.&rft.au=Corley%2C+D.G.%3B+Durley%2C+R.C.&rft.date=1994&rft.volume=57&rft.issue=11&rft.pages=1484%E2%80%931490&rft_id=info:doi\/10.1021%2Fnp50113a002&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-OliveiraTemporal13-4\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-OliveiraTemporal13_4-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Oliveira, T.; Chagas-Paula, D.; Rosa, A. et al. (2013). \"Temporal characteristics of a natural products in-house database\". <i>Planta Medica<\/i> <b>79<\/b>: 1113\u20131114. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1055%2Fs-0033-1351852\" target=\"_blank\">10.1055\/s-0033-1351852<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Temporal+characteristics+of+a+natural+products+in-house+database&rft.jtitle=Planta+Medica&rft.aulast=Oliveira%2C+T.%3B+Chagas-Paula%2C+D.%3B+Rosa%2C+A.+et+al.&rft.au=Oliveira%2C+T.%3B+Chagas-Paula%2C+D.%3B+Rosa%2C+A.+et+al.&rft.date=2013&rft.volume=79&rft.pages=1113%E2%80%931114&rft_id=info:doi\/10.1055%2Fs-0033-1351852&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PenceChemSpider10-5\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PenceChemSpider10_5-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Pence, H.E.; Williams, A. (2010). \"ChemSpider: An Online Chemical Information Resource\". <i>Journal of Chemical Education<\/i> <b>87<\/b> (11): 1123\u20131124. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1021%2Fed100697w\" target=\"_blank\">10.1021\/ed100697w<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=ChemSpider%3A+An+Online+Chemical+Information+Resource&rft.jtitle=Journal+of+Chemical+Education&rft.aulast=Pence%2C+H.E.%3B+Williams%2C+A.&rft.au=Pence%2C+H.E.%3B+Williams%2C+A.&rft.date=2010&rft.volume=87&rft.issue=11&rft.pages=1123%E2%80%931124&rft_id=info:doi\/10.1021%2Fed100697w&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BoltonPubChem08-6\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BoltonPubChem08_6-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Bolton, E.E.; Wang, Y.; Thiessen, P.A.; Bryant, S.H. (2008). \"Chapter 12 - PubChem: Integrated Platform of Small Molecules and Biological Activities\". In Wheeler, R.A.; Spellmeyer, D.C.. <i>Annual Reports in Computational Chemistry<\/i>. <b>4<\/b>. Elsevier Ltd. pp. 217-241. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2FS1574-1400%2808%2900012-1\" target=\"_blank\">10.1016\/S1574-1400(08)00012-1<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9780444532503.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Chapter+12+-+PubChem%3A+Integrated+Platform+of+Small+Molecules+and+Biological+Activities&rft.atitle=Annual+Reports+in+Computational+Chemistry&rft.aulast=Bolton%2C+E.E.%3B+Wang%2C+Y.%3B+Thiessen%2C+P.A.%3B+Bryant%2C+S.H.&rft.au=Bolton%2C+E.E.%3B+Wang%2C+Y.%3B+Thiessen%2C+P.A.%3B+Bryant%2C+S.H.&rft.date=2008&rft.volume=4&rft.pages=pp.%26nbsp%3B217-241&rft.pub=Elsevier+Ltd&rft_id=info:doi\/10.1016%2FS1574-1400%2808%2900012-1&rft.isbn=9780444532503&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DegtyarenkoChEBI08-7\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-DegtyarenkoChEBI08_7-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Degtyarenko, K.; de Matos, P.; Ennis, M. et al. (2008). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2238832\" target=\"_blank\">\"ChEBI: A database and ontology for chemical entities of biological interest\"<\/a>. <i>Nucleic Acids Research<\/i> <b>36<\/b> (DB1): D344-50. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fnar%2Fgkm791\" target=\"_blank\">10.1093\/nar\/gkm791<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC2238832\/\" target=\"_blank\">PMC2238832<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/17932057\" target=\"_blank\">17932057<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2238832\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2238832<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=ChEBI%3A+A+database+and+ontology+for+chemical+entities+of+biological+interest&rft.jtitle=Nucleic+Acids+Research&rft.aulast=Degtyarenko%2C+K.%3B+de+Matos%2C+P.%3B+Ennis%2C+M.+et+al.&rft.au=Degtyarenko%2C+K.%3B+de+Matos%2C+P.%3B+Ennis%2C+M.+et+al.&rft.date=2008&rft.volume=36&rft.issue=DB1&rft.pages=D344-50&rft_id=info:doi\/10.1093%2Fnar%2Fgkm791&rft_id=info:pmc\/PMC2238832&rft_id=info:pmid\/17932057&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC2238832&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-IrwinZINC05-8\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-IrwinZINC05_8-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Irwin, J.J.; Shoichet, B.K. (2005). 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC1360656\" target=\"_blank\">\"ZINC: A free database of commercially available compounds for virtual screening\"<\/a>. <i>Journal of Chemical Information and Modeling<\/i> <b>45<\/b> (1): 177\u201382. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1021%2Fci049714%2B\" target=\"_blank\">10.1021\/ci049714+<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC1360656\/\" target=\"_blank\">PMC1360656<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/15667143\" target=\"_blank\">15667143<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC1360656\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC1360656<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=ZINC%3A+A+free+database+of+commercially+available+compounds+for+virtual+screening&rft.jtitle=Journal+of+Chemical+Information+and+Modeling&rft.aulast=Irwin%2C+J.J.%3B+Shoichet%2C+B.K.&rft.au=Irwin%2C+J.J.%3B+Shoichet%2C+B.K.&rft.date=2005&rft.volume=45&rft.issue=1&rft.pages=177%E2%80%9382&rft_id=info:doi\/10.1021%2Fci049714%2B&rft_id=info:pmc\/PMC1360656&rft_id=info:pmid\/15667143&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC1360656&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WilliamsDerep15-9\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-WilliamsDerep15_9-0\" rel=\"external_link\">9.0<\/a><\/sup> <sup><a href=\"#cite_ref-WilliamsDerep15_9-1\" rel=\"external_link\">9.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Williams, R.B.; O'Neil-Johnson, M.; Williams, A.J. et al. (2015). \"Dereplication of natural products using minimal NMR data inputs\". <i>Organic & Biomolecular Chemistry<\/i> <b>13<\/b> (39): 9957\u201362. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1039%2Fc5ob01713k\" target=\"_blank\">10.1039\/c5ob01713k<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26381222\" target=\"_blank\">26381222<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Dereplication+of+natural+products+using+minimal+NMR+data+inputs&rft.jtitle=Organic+%26+Biomolecular+Chemistry&rft.aulast=Williams%2C+R.B.%3B+O%27Neil-Johnson%2C+M.%3B+Williams%2C+A.J.+et+al.&rft.au=Williams%2C+R.B.%3B+O%27Neil-Johnson%2C+M.%3B+Williams%2C+A.J.+et+al.&rft.date=2015&rft.volume=13&rft.issue=39&rft.pages=9957%E2%80%9362&rft_id=info:doi\/10.1039%2Fc5ob01713k&rft_id=info:pmid\/26381222&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-EugsterRetention14-10\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-EugsterRetention14_10-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Eugster, P.J.; Boccard, J.; Debrus, B. et al. (2014). \"Retention time prediction for dereplication of natural products (CxHyOz) in LC-MS metabolite profiling\". <i>Phytochemistry<\/i> <b>108<\/b>: 196\u2013207. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.phytochem.2014.10.005\" target=\"_blank\">10.1016\/j.phytochem.2014.10.005<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25457501\" target=\"_blank\">25457501<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Retention+time+prediction+for+dereplication+of+natural+products+%28CxHyOz%29+in+LC-MS+metabolite+profiling&rft.jtitle=Phytochemistry&rft.aulast=Eugster%2C+P.J.%3B+Boccard%2C+J.%3B+Debrus%2C+B.+et+al.&rft.au=Eugster%2C+P.J.%3B+Boccard%2C+J.%3B+Debrus%2C+B.+et+al.&rft.date=2014&rft.volume=108&rft.pages=196%E2%80%93207&rft_id=info:doi\/10.1016%2Fj.phytochem.2014.10.005&rft_id=info:pmid\/25457501&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CRCDictionary-11\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CRCDictionary_11-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dnp.chemnetbase.com\/\" target=\"_blank\">\"Dictionary of Natural Products\"<\/a>. CRC Press<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/dnp.chemnetbase.com\/\" target=\"_blank\">http:\/\/dnp.chemnetbase.com\/<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 04 March 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Dictionary+of+Natural+Products&rft.atitle=&rft.pub=CRC+Press&rft_id=http%3A%2F%2Fdnp.chemnetbase.com%2F&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GrahamTheNAPRALERT10-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GrahamTheNAPRALERT10_12-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Graham, J.G.; Farnsworth, N.R. (2010). \"The NAPRALERT database as an aid for discovery of novel bioactive compounds\". <i>Comprehensive Natural Products II: Chemistry and Biology<\/i>. <b>3<\/b>. Elsevier Ltd. pp. 81\u201394. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9780080453828.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=The+NAPRALERT+database+as+an+aid+for+discovery+of+novel+bioactive+compounds&rft.atitle=Comprehensive+Natural+Products+II%3A+Chemistry+and+Biology&rft.aulast=Graham%2C+J.G.%3B+Farnsworth%2C+N.R.&rft.au=Graham%2C+J.G.%3B+Farnsworth%2C+N.R.&rft.date=2010&rft.volume=3&rft.pages=pp.%26nbsp%3B81%E2%80%9394&rft.pub=Elsevier+Ltd&rft.isbn=9780080453828&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DabbMarinLit-13\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-DabbMarinLit_13-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation 
Journal\">Dabb, S.; Blunt, J.; Munro, M. (2014). \"MarinLit: Database and essential tools for the marine natural products community\". <i>Proceedings of the 248th National Meeting of the American-Chemical-Society (ACS)<\/i> <b>248<\/b>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=MarinLit%3A+Database+and+essential+tools+for+the+marine+natural+products+community&rft.jtitle=Proceedings+of+the+248th+National+Meeting+of+the+American-Chemical-Society+%28ACS%29&rft.aulast=Dabb%2C+S.%3B+Blunt%2C+J.%3B+Munro%2C+M.&rft.au=Dabb%2C+S.%3B+Blunt%2C+J.%3B+Munro%2C+M.&rft.date=2014&rft.volume=248&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BanerjeeSuper15-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BanerjeeSuper15_14-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Banerjee, P.; Erehman, J.; Gohlke, B.O. et al. (2015). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4384003\" target=\"_blank\">\"Super Natural II:A database of natural products\"<\/a>. <i>Nucleic Acids Research<\/i> <b>43<\/b> (DB1): D935\u20139. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fnar%2Fgku886\" target=\"_blank\">10.1093\/nar\/gku886<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4384003\/\" target=\"_blank\">PMC4384003<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25300487\" target=\"_blank\">25300487<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4384003\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4384003<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Super+Natural+II%3AA+database+of+natural+products&rft.jtitle=Nucleic+Acids+Research&rft.aulast=Banerjee%2C+P.%3B+Erehman%2C+J.%3B+Gohlke%2C+B.O.+et+al.&rft.au=Banerjee%2C+P.%3B+Erehman%2C+J.%3B+Gohlke%2C+B.O.+et+al.&rft.date=2015&rft.volume=43&rft.issue=DB1&rft.pages=D935%E2%80%939&rft_id=info:doi\/10.1093%2Fnar%2Fgku886&rft_id=info:pmc\/PMC4384003&rft_id=info:pmid\/25300487&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4384003&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DrewryEnhance10-15\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-DrewryEnhance10_15-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Drewry, D.H.; Macarron, R. (2010). 
\"Enhancements of screening collections to address areas of unmet medical need: An industry perspective\". <i>Current Opinions in Chemical Biology<\/i> <b>14<\/b> (3): 289\u201398. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.cbpa.2010.03.024\" target=\"_blank\">10.1016\/j.cbpa.2010.03.024<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/20413343\" target=\"_blank\">20413343<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Enhancements+of+screening+collections+to+address+areas+of+unmet+medical+need%3A+An+industry+perspective&rft.jtitle=Current+Opinions+in+Chemical+Biology&rft.aulast=Drewry%2C+D.H.%3B+Macarron%2C+R.&rft.au=Drewry%2C+D.H.%3B+Macarron%2C+R.&rft.date=2010&rft.volume=14&rft.issue=3&rft.pages=289%E2%80%9398&rft_id=info:doi\/10.1016%2Fj.cbpa.2010.03.024&rft_id=info:pmid\/20413343&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ValliDevelop13-16\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ValliDevelop13_16-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Valli, M.; dos Santos, R.N.; Figueira, L.D. et al. (2013). \"Development of a natural products database from the biodiversity of Brazil\". <i>Journal of Natural Products<\/i> <b>76<\/b> (3): 439-44. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1021%2Fnp3006875\" target=\"_blank\">10.1021\/np3006875<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23330984\" target=\"_blank\">23330984<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Development+of+a+natural+products+database+from+the+biodiversity+of+Brazil&rft.jtitle=Journal+of+Natural+Products&rft.aulast=Valli%2C+M.%3B+dos+Santos%2C+R.N.%3B+Figueira%2C+L.D.+et+al.&rft.au=Valli%2C+M.%3B+dos+Santos%2C+R.N.%3B+Figueira%2C+L.D.+et+al.&rft.date=2013&rft.volume=76&rft.issue=3&rft.pages=439-44&rft_id=info:doi\/10.1021%2Fnp3006875&rft_id=info:pmid\/23330984&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HatherleySANCDB15-17\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HatherleySANCDB15_17-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Hatherley, R.; Brown, D.K.; Musyoka, T.M. et al. (2015). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4471313\" target=\"_blank\">\"SANCDB: A South African natural compound database\"<\/a>. <i>Journal of Cheminformatics<\/i> <b>7<\/b>: 29. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2Fs13321-015-0080-8\" target=\"_blank\">10.1186\/s13321-015-0080-8<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4471313\/\" target=\"_blank\">PMC4471313<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26097510\" target=\"_blank\">26097510<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4471313\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4471313<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=SANCDB%3A+A+South+African+natural+compound+database&rft.jtitle=Journal+of+Cheminformatics&rft.aulast=Hatherley%2C+R.%3B+Brown%2C+D.K.%3B+Musyoka%2C+T.M.+et+al.&rft.au=Hatherley%2C+R.%3B+Brown%2C+D.K.%3B+Musyoka%2C+T.M.+et+al.&rft.date=2015&rft.volume=7&rft.pages=29&rft_id=info:doi\/10.1186%2Fs13321-015-0080-8&rft_id=info:pmc\/PMC4471313&rft_id=info:pmid\/26097510&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4471313&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li 
id=\"cite_note-KimTM-MC15-18\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-KimTM-MC15_18-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Kim, S.K.; Nam, S.; Jang, H. et al. (2015). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4495939\" target=\"_blank\">\"TM-MC: A database of medicinal materials and chemical compounds in Northeast Asian traditional medicine\"<\/a>. <i>BMC Complementary and Alternative Medicine<\/i> <b>15<\/b>: 218. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2Fs12906-015-0758-5\" target=\"_blank\">10.1186\/s12906-015-0758-5<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4495939\/\" target=\"_blank\">PMC4495939<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26156871\" target=\"_blank\">26156871<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4495939\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4495939<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=TM-MC%3A+A+database+of+medicinal+materials+and+chemical+compounds+in+Northeast+Asian+traditional+medicine&rft.jtitle=BMC+Complementary+and+Alternative+Medicine&rft.aulast=Kim%2C+S.K.%3B+Nam%2C+S.%3B+Jang%2C+H.+et+al.&rft.au=Kim%2C+S.K.%3B+Nam%2C+S.%3B+Jang%2C+H.+et+al.&rft.date=2015&rft.volume=15&rft.pages=218&rft_id=info:doi\/10.1186%2Fs12906-015-0758-5&rft_id=info:pmc\/PMC4495939&rft_id=info:pmid\/26156871&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4495939&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ChenTCM11-19\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ChenTCM11_19-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Chen, C.Y. (2011). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3017089\" target=\"_blank\">\"TCM Database@Taiwan: The world's largest traditional Chinese medicine database for drug screening in silico\"<\/a>. <i>PLoS One<\/i> <b>6<\/b> (1): e15939. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pone.0015939\" target=\"_blank\">10.1371\/journal.pone.0015939<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3017089\/\" target=\"_blank\">PMC3017089<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/21253603\" target=\"_blank\">21253603<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3017089\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3017089<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=TCM+Database%40Taiwan%3A+The+world%27s+largest+traditional+Chinese+medicine+database+for+drug+screening+in+silico&rft.jtitle=PLoS+One&rft.aulast=Chen%2C+C.Y.&rft.au=Chen%2C+C.Y.&rft.date=2011&rft.volume=6&rft.issue=1&rft.pages=e15939&rft_id=info:doi\/10.1371%2Fjournal.pone.0015939&rft_id=info:pmc\/PMC3017089&rft_id=info:pmid\/21253603&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3017089&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NtieNANPDB17-20\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-NtieNANPDB17_20-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Ntie-Kang, F.; Telukunta, K.K.; D\u00f6ring, K. et al. (2017). \"NANPDB: A Resource for Natural Products from Northern African Sources\". 
<i>Journal of Natural Products<\/i> <b>80<\/b> (7): 2067-2076. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1021%2Facs.jnatprod.7b00283\" target=\"_blank\">10.1021\/acs.jnatprod.7b00283<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/28641017\" target=\"_blank\">28641017<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=NANPDB%3A+A+Resource+for+Natural+Products+from+Northern+African+Sources&rft.jtitle=Journal+of+Natural+Products&rft.aulast=Ntie-Kang%2C+F.%3B+Telukunta%2C+K.K.%3B+D%C3%B6ring%2C+K.+et+al.&rft.au=Ntie-Kang%2C+F.%3B+Telukunta%2C+K.K.%3B+D%C3%B6ring%2C+K.+et+al.&rft.date=2017&rft.volume=80&rft.issue=7&rft.pages=2067-2076&rft_id=info:doi\/10.1021%2Facs.jnatprod.7b00283&rft_id=info:pmid\/28641017&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-XueTCMID13-21\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-XueTCMID13_21-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Xue, R.; Fang, Z.; Zhang, M. et al. (2013). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3531123\" target=\"_blank\">\"TCMID: Traditional Chinese Medicine integrative database for herb molecular mechanism analysis\"<\/a>. <i>Nucleic Acids Research<\/i> <b>41<\/b> (DB1): D1089-95. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fnar%2Fgks1100\" target=\"_blank\">10.1093\/nar\/gks1100<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3531123\/\" target=\"_blank\">PMC3531123<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23203875\" target=\"_blank\">23203875<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3531123\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3531123<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=TCMID%3A+Traditional+Chinese+Medicine+integrative+database+for+herb+molecular+mechanism+analysis&rft.jtitle=Nucleic+Acids+Research&rft.aulast=Xue%2C+R.%3B+Fang%2C+Z.%3B+Zhang%2C+M.+et+al.&rft.au=Xue%2C+R.%3B+Fang%2C+Z.%3B+Zhang%2C+M.+et+al.&rft.date=2013&rft.volume=41&rft.issue=DB1&rft.pages=D1089-95&rft_id=info:doi\/10.1093%2Fnar%2Fgks1100&rft_id=info:pmc\/PMC3531123&rft_id=info:pmid\/23203875&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3531123&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li 
id=\"cite_note-AfendiKNApSAcK12-22\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AfendiKNApSAcK12_22-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Afendi, F.M.; Okada, T.; Yamazaki, M. et al. (2012). \"KNApSAcK family databases: Integrated metabolite-plant species databases for multifaceted plant research\". <i>Plant & Cell Physiology<\/i> <b>53<\/b> (2): e1. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fpcp%2Fpcr165\" target=\"_blank\">10.1093\/pcp\/pcr165<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/22123792\" target=\"_blank\">22123792<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=KNApSAcK+family+databases%3A+Integrated+metabolite-plant+species+databases+for+multifaceted+plant+research&rft.jtitle=Plant+%26+Cell+Physiology&rft.aulast=Afendi%2C+F.M.%3B+Okada%2C+T.%3B+Yamazaki%2C+M.+et+al.&rft.au=Afendi%2C+F.M.%3B+Okada%2C+T.%3B+Yamazaki%2C+M.+et+al.&rft.date=2012&rft.volume=53&rft.issue=2&rft.pages=e1&rft_id=info:doi\/10.1093%2Fpcp%2Fpcr165&rft_id=info:pmid\/22123792&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-TungTIPdb14-23\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-TungTIPdb14_23-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Tung, C.W.; Lin, Y.C.; Chang, H.S. et al. 
(2014). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4057645\" target=\"_blank\">\"TIPdb-3D: The three-dimensional structure database of phytochemicals from Taiwan indigenous plants\"<\/a>. <i>Database<\/i> <b>2014<\/b>: bau055. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fdatabase%2Fbau055\" target=\"_blank\">10.1093\/database\/bau055<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4057645\/\" target=\"_blank\">PMC4057645<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/24930145\" target=\"_blank\">24930145<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4057645\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4057645<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=TIPdb-3D%3A+The+three-dimensional+structure+database+of+phytochemicals+from+Taiwan+indigenous+plants&rft.jtitle=Database&rft.aulast=Tung%2C+C.W.%3B+Lin%2C+Y.C.%3B+Chang%2C+H.S.+et+al.&rft.au=Tung%2C+C.W.%3B+Lin%2C+Y.C.%3B+Chang%2C+H.S.+et+al.&rft.date=2014&rft.volume=2014&rft.pages=bau055&rft_id=info:doi\/10.1093%2Fdatabase%2Fbau055&rft_id=info:pmc\/PMC4057645&rft_id=info:pmid\/24930145&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4057645&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ABCAsterDB-24\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ABCAsterDB_24-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.asterbiochem.org\/asterdb\" target=\"_blank\">\"AsterDB\"<\/a>. AsterBioChem<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.asterbiochem.org\/asterdb\" target=\"_blank\">http:\/\/www.asterbiochem.org\/asterdb<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 04 March 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=AsterDB&rft.atitle=&rft.pub=AsterBioChem&rft_id=http%3A%2F%2Fwww.asterbiochem.org%2Fasterdb&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SampaioEffect16-25\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SampaioEffect16_25-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Sampaio, B.L.; Edrada-Ebel, R.; Da Costa, F.B. (2016). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4935878\" target=\"_blank\">\"Effect of the environment on the secondary metabolic profile of Tithonia diversifolia: A model for environmental metabolomics of plants\"<\/a>. <i>Scientific Reports<\/i> <b>6<\/b>: 29265. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fsrep29265\" target=\"_blank\">10.1038\/srep29265<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4935878\/\" target=\"_blank\">PMC4935878<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/27383265\" target=\"_blank\">27383265<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4935878\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4935878<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Effect+of+the+environment+on+the+secondary+metabolic+profile+of+Tithonia+diversifolia%3A+A+model+for+environmental+metabolomics+of+plants&rft.jtitle=Scientific+Reports&rft.aulast=Sampaio%2C+B.L.%3B+Edrada-Ebel%2C+R.%3B+Da+Costa%2C+F.B.&rft.au=Sampaio%2C+B.L.%3B+Edrada-Ebel%2C+R.%3B+Da+Costa%2C+F.B.&rft.date=2016&rft.volume=6&rft.pages=29265&rft_id=info:doi\/10.1038%2Fsrep29265&rft_id=info:pmc\/PMC4935878&rft_id=info:pmid\/27383265&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4935878&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SchmidtLarrea12-26\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SchmidtLarrea12_26-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Schmidt, T.J.; Rzeppa, S.; Kaiser, M.; Brun, R. (2012). \"Larrea tridentata\u2014Absolute configuration of its epoxylignans and investigations on its antiprotozoal activity\". <i>Phytochemistry Letters<\/i> <b>5<\/b> (3): 632-638. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.phytol.2012.06.011\" target=\"_blank\">10.1016\/j.phytol.2012.06.011<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Larrea+tridentata%E2%80%94Absolute+configuration+of+its+epoxylignans+and+investigations+on+its+antiprotozoal+activity&rft.jtitle=Phytochemistry+Letters&rft.aulast=Schmidt%2C+T.J.%3B+Rzeppa%2C+S.%3B+Kaiser%2C+M.%3B+Brun%2C+R.&rft.au=Schmidt%2C+T.J.%3B+Rzeppa%2C+S.%3B+Kaiser%2C+M.%3B+Brun%2C+R.&rft.date=2012&rft.volume=5&rft.issue=3&rft.pages=632-638&rft_id=info:doi\/10.1016%2Fj.phytol.2012.06.011&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GaquerelComput13-27\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GaquerelComput13_27-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Gaquerel, E.; Kuhl, C.; Neumann, S. (2013). \"Computational annotation of plant metabolomics profiles via a novel network-assisted approach\". <i>Metabolomics<\/i> <b>9<\/b> (4): 904\u2013918. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2Fs11306-013-0504-2\" target=\"_blank\">10.1007\/s11306-013-0504-2<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Computational+annotation+of+plant+metabolomics+profiles+via+a+novel+network-assisted+approach&rft.jtitle=Metabolomics&rft.aulast=Gaquerel%2C+E.%3B+Kuhl%2C+C.%3B+Neumann%2C+S.&rft.au=Gaquerel%2C+E.%3B+Kuhl%2C+C.%3B+Neumann%2C+S.&rft.date=2013&rft.volume=9&rft.issue=4&rft.pages=904%E2%80%93918&rft_id=info:doi\/10.1007%2Fs11306-013-0504-2&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HendricksonOrganic70-28\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HendricksonOrganic70_28-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Hendrickson, J.B.; Cram, D.J.; Hammond, G.S. (1970). <i>Organic Chemistry<\/i>. McGraw-Hill. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 070281505.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=book&rft.btitle=Organic+Chemistry&rft.aulast=Hendrickson%2C+J.B.%3B+Cram%2C+D.J.%3B+Hammond%2C+G.S.&rft.au=Hendrickson%2C+J.B.%3B+Cram%2C+D.J.%3B+Hammond%2C+G.S.&rft.date=1970&rft.pub=McGraw-Hill&rft.isbn=070281505&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GottliebTheRole89-29\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GottliebTheRole89_29-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Gottlieb, O. (1989). \"The role of oxygen in phytochemical evolution towards diversity\". <i>Phytochemistry<\/i> <b>28<\/b>: 2545\u20132558.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+role+of+oxygen+in+phytochemical+evolution+towards+diversity&rft.jtitle=Phytochemistry&rft.aulast=Gottlieb%2C+O.&rft.au=Gottlieb%2C+O.&rft.date=1989&rft.volume=28&rft.pages=2545%E2%80%932558&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Z.C3.BCstNatural12-30\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Z.C3.BCstNatural12_30-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Z\u00fcst, T.; Heichinger, C.; Grossniklaus, U. et al. (2012). \"Natural enemies drive geographic variation in plant defenses\". 
<i>Science<\/i> <b>338<\/b> (6103): 116\u20139. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1126%2Fscience.1226397\" target=\"_blank\">10.1126\/science.1226397<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23042895\" target=\"_blank\">23042895<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Natural+enemies+drive+geographic+variation+in+plant+defenses&rft.jtitle=Science&rft.aulast=Z%C3%BCst%2C+T.%3B+Heichinger%2C+C.%3B+Grossniklaus%2C+U.+et+al.&rft.au=Z%C3%BCst%2C+T.%3B+Heichinger%2C+C.%3B+Grossniklaus%2C+U.+et+al.&rft.date=2012&rft.volume=338&rft.issue=6103&rft.pages=116%E2%80%939&rft_id=info:doi\/10.1126%2Fscience.1226397&rft_id=info:pmid\/23042895&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MySQLDownload-31\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MySQLDownload_31-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/dev.mysql.com\/downloads\/mysql\/\" target=\"_blank\">\"Download MySQL Community Server\"<\/a>. Oracle Corporation<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/dev.mysql.com\/downloads\/mysql\/\" target=\"_blank\">https:\/\/dev.mysql.com\/downloads\/mysql\/<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 31 January 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Download+MySQL+Community+Server&rft.atitle=&rft.pub=Oracle+Corporation&rft_id=https%3A%2F%2Fdev.mysql.com%2Fdownloads%2Fmysql%2F&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ScottiSistematiX15-32\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ScottiSistematiX15_32-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Scotti, M.T.; Da Silva Junior, R.O.; De Oliveira Santos, S.Y.K. et al. (2015). <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/sciforum.net\/conference\/MOL2NET-1\/paper\/3348\/download\/pdf\" target=\"_blank\">\"SISTEMAT X - A Web Tool to Manage Databases of Secondary Metabolites\"<\/a>. <i>Proceedings of MOL2NET, International Conference on Multidisciplinary Sciences<\/i> <b>1<\/b> (Section F): 1\u20136<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/sciforum.net\/conference\/MOL2NET-1\/paper\/3348\/download\/pdf\" target=\"_blank\">https:\/\/sciforum.net\/conference\/MOL2NET-1\/paper\/3348\/download\/pdf<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=SISTEMAT+X+-+A+Web+Tool+to+Manage+Databases+of%0ASecondary+Metabolites&rft.jtitle=Proceedings+of+MOL2NET%2C+International+Conference+on+Multidisciplinary+Sciences&rft.aulast=Scotti%2C+M.T.%3B+Da+Silva+Junior%2C+R.O.%3B+De+Oliveira+Santos%2C+S.Y.K.+et+al.&rft.au=Scotti%2C+M.T.%3B+Da+Silva+Junior%2C+R.O.%3B+De+Oliveira+Santos%2C+S.Y.K.+et+al.&rft.date=2015&rft.volume=1&rft.issue=Section+F&rft.pages=1%E2%80%936&rft_id=https%3A%2F%2Fsciforum.net%2Fconference%2FMOL2NET-1%2Fpaper%2F3348%2Fdownload%2Fpdf&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Bootstrap-33\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Bootstrap_33-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/getbootstrap.com\/\" target=\"_blank\">\"Bootstrap\"<\/a>. Bootstrap Development Team<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/getbootstrap.com\/\" target=\"_blank\">http:\/\/getbootstrap.com\/<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 10 January 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Bootstrap&rft.atitle=&rft.pub=Bootstrap+Development+Team&rft_id=http%3A%2F%2Fgetbootstrap.com%2F&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-jQuery-34\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-jQuery_34-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/jquery.com\/\" target=\"_blank\">\"jQuery - Write less, do more\"<\/a>. The jQuery Foundation<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/jquery.com\/\" target=\"_blank\">http:\/\/jquery.com\/<\/a><\/span><span class=\"reference-accessdate\">. Retrieved 17 January 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=jQuery+-+Write+less%2C+do+more&rft.atitle=&rft.pub=The+jQuery+Foundation&rft_id=http%3A%2F%2Fjquery.com%2F&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-jQueryAuto-35\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-jQueryAuto_35-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/jqueryui.com\/autocomplete\/\" target=\"_blank\">\"jQuery - User interface\"<\/a>. The jQuery Foundation<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/jqueryui.com\/autocomplete\/\" target=\"_blank\">http:\/\/jqueryui.com\/autocomplete\/<\/a><\/span><span class=\"reference-accessdate\">. Retrieved 20 January 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=jQuery+-+User+interface&rft.atitle=&rft.pub=The+jQuery+Foundation&rft_id=http%3A%2F%2Fjqueryui.com%2Fautocomplete%2F&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MarvinJS-36\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MarvinJS_36-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/chemaxon.com\/products\/marvin-js\" target=\"_blank\">\"Marvin JS\"<\/a>. ChemAxon Ltd<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/chemaxon.com\/products\/marvin-js\" target=\"_blank\">https:\/\/chemaxon.com\/products\/marvin-js<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 24 January 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Marvin+JS&rft.atitle=&rft.pub=ChemAxon+Ltd&rft_id=https%3A%2F%2Fchemaxon.com%2Fproducts%2Fmarvin-js&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-jChemEngine-37\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-jChemEngine_37-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/chemaxon.com\/products\/jchem-engines\" target=\"_blank\">\"JChem Engines\"<\/a>. ChemAxon Ltd<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/chemaxon.com\/products\/jchem-engines\" target=\"_blank\">https:\/\/chemaxon.com\/products\/jchem-engines<\/a><\/span><span class=\"reference-accessdate\">. Retrieved 24 January 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=JChem+Engines&rft.atitle=&rft.pub=ChemAxon+Ltd&rft_id=https%3A%2F%2Fchemaxon.com%2Fproducts%2Fjchem-engines&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ChemDoodle-38\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ChemDoodle_38-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/web.chemdoodle.com\/\" target=\"_blank\">\"ChemDoodle Web Components\"<\/a>. iChemLabs, LLC<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/web.chemdoodle.com\/\" target=\"_blank\">https:\/\/web.chemdoodle.com\/<\/a><\/span><span class=\"reference-accessdate\">. Retrieved 24 January 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=ChemDoodle+Web+Components&rft.atitle=&rft.pub=iChemLabs%2C+LLC&rft_id=https%3A%2F%2Fweb.chemdoodle.com%2F&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GoogleMaps-39\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GoogleMaps_39-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/developers.google.com\/maps\/\" target=\"_blank\">\"Google Maps APIs\"<\/a>. Google Inc<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/developers.google.com\/maps\/\" target=\"_blank\">https:\/\/developers.google.com\/maps\/<\/a><\/span><span class=\"reference-accessdate\">. Retrieved 13 January 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Google+Maps+APIs&rft.atitle=&rft.pub=Google+Inc&rft_id=https%3A%2F%2Fdevelopers.google.com%2Fmaps%2F&rfr_id=info:sid\/en.wikipedia.org:Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to presentation. 
In some cases important information was missing from the references, and that information was added.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites\">https:\/\/www.limswiki.org\/index.php\/Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div 
class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","478e0fc6bdbd74f64773b750f2c9edcc_images":["https:\/\/www.limswiki.org\/images\/b\/b4\/Fig1_Scotti_Molecules2018_23-1.png","https:\/\/www.limswiki.org\/images\/5\/56\/Fig2_Scotti_Molecules2018_23-1.png","https:\/\/www.limswiki.org\/images\/3\/37\/Fig3_Scotti_Molecules2018_23-1.png","https:\/\/www.limswiki.org\/images\/1\/16\/Fig4_Scotti_Molecules2018_23-1.png","https:\/\/www.limswiki.org\/images\/0\/05\/Fig5_Scotti_Molecules2018_23-1.png"],"478e0fc6bdbd74f64773b750f2c9edcc_timestamp":1544813852,"684d7a3a2f6583b431b16f4884c7c07d_type":"article","684d7a3a2f6583b431b16f4884c7c07d_title":"Closha: Bioinformatics workflow system for the analysis of massive sequencing data (Ko et al. 2018)","684d7a3a2f6583b431b16f4884c7c07d_url":"https:\/\/www.limswiki.org\/index.php\/Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data","684d7a3a2f6583b431b16f4884c7c07d_plaintext":"\n\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\n\t\t\t\tJournal:Closha: Bioinformatics workflow system for the analysis of massive sequencing data\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tFrom LIMSWiki\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\tJump to: navigation, search\n\n\t\t\t\t\t\n\t\t\t\t\tFull article title\n \nClosha: Bioinformatics workflow system for the analysis of massive sequencing dataJournal\n \nBMC BioinformaticsAuthor(s)\n \nKo, GunHwan; Kim, Pan-Gyu; Yoon, Jongcheol; Han, Gukhee; Park, Seong-Jin; Song, Wangho; Lee, ByungwookAuthor affiliation(s)\n \nKorean BioInformation CenterPrimary contact\n \nEmail: bulee at kribb dot re dot krYear published\n \n2018Volume and issue\n \n19(Suppl 1)Page(s)\n \n43DOI\n \n10.1186\/s12859-018-2019-3ISSN\n \n1471-2105Distribution license\n \nCreative Commons Attribution 4.0 InternationalWebsite\n \nhttps:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-018-2019-3Download\n 
\nhttps:\/\/bmcbioinformatics.biomedcentral.com\/track\/pdf\/10.1186\/s12859-018-2019-3 (PDF)\n\nContents\n\n1 Abstract \n2 Background \n3 Methods \n\n3.1 Goals of Closha \n3.2 Cluster configuration \n3.3 Closha workspace \n3.4 Workflow editor (canvas) \n3.5 Representing analysis pipelines of workflows \n3.6 Uploading data to Closha \n3.7 Hybrid system \n3.8 Elastic scalability \n\n\n4 Results \n\n4.1 Analysis pipelines \n4.2 RNA-Seq pipeline \n4.3 Creating a new pipeline \n\n\n5 Discussion \n6 Conclusions \n7 Declarations \n\n7.1 Acknowledgments \n\n7.1.1 Funding \n7.1.2 About this supplement \n7.1.3 Author\u2019s contributions \n7.1.4 Competing interests \n\n\n\n\n8 References \n9 Notes \n\n\n\nAbstract \nBackground: While next-generation sequencing (NGS) costs have fallen in recent years, the cost and complexity of computation remain substantial obstacles to the use of NGS in bio-medical care and genomic research. The rapidly increasing amounts of data available from the new high-throughput methods have made data processing infeasible without automated pipelines. The integration of data and analytic resources into workflow systems provides a solution to the problem by simplifying the task of data analysis.\nResults: To address this challenge, we developed a cloud-based workflow management system, Closha, to provide fast and cost-effective analysis of massive genomic data. We implemented complex workflows making optimal use of high-performance computing clusters. Closha allows users to create multi-step analyses using drag-and-drop functionality and to modify the parameters of pipeline tools. Users can also import Galaxy pipelines into Closha. Closha is a hybrid system that enables users to use both analysis programs providing traditional tools and MapReduce-based big data analysis programs simultaneously in a single pipeline. Thus, the execution of analytics algorithms can be parallelized, speeding up the whole process. 
We also developed a high-speed data transmission solution, KoDS, to transmit a large amount of data at a fast rate. KoDS has a file transfer speed of up to 10 times that of normal FTP and HTTP. The computer hardware for Closha is 660 CPU cores and 800 TB of disk storage, enabling 500 jobs to run at the same time.\nConclusions: Closha is a scalable, cost-effective, and publicly available web service for large-scale genomic data analysis. Closha supports the reliable and highly scalable execution of sequencing analysis workflows in a fully automated manner. Closha provides a user-friendly interface to all genomic scientists to try to derive accurate results from NGS platform data. The Closha cloud server is freely available for use from http:\/\/closha.kobic.re.kr\/.\n\nBackground \nWith the emergence of next-generation sequencing (NGS) technology in 2005, the field of genomics is caught in a data deluge. Modern sequencing platforms are capable of sequencing approximately 5000 M-bases per day.[1] DNA sequencing is becoming faster and less expensive at a pace far outstripping Moore\u2019s law, which describes the rate at which computing becomes faster and less expensive. As a result of the increased efficiency and diminished cost of NGS, the demand for clinical and agricultural applications is rapidly increasing.[2] In the bioinformatics community, acquiring massive sequencing data is always followed by large-scale computational analysis to process the data and obtain scientific insights. Therefore, investment in a sequencing instrument would normally be accompanied by substantial investment in computer hardware, analysis pipelines, and bioinformatics experts to analyze the data.[3]\nWhen genomic datasets were small, they could be analyzed on personal computers in a few hours or perhaps overnight.[4] However, this approach does not apply to large NGS datasets. 
Instead, researchers require high-performance computers and parallel algorithms to analyze their big genomic data in a timely manner.[5] While high-performance computing is essential for data analysis, only a small number of biomedical research labs are equipped to make effective and successful use of parallel computers.[6] Obstacles include the complexities inherent in managing large NGS datasets and assembling and configuring multi-step genome sequencing pipelines, as well as the difficulties inherent in adapting pipelines to process NGS data on parallel computers.[7]\nThe difficulties in creating these complicated computational pipelines, installing and maintaining software packages, and obtaining sufficient computational resources tend to overwhelm bench biologists and prevent them from attempting to analyze their own genomic data.[8] Despite the availability of a vast set of computational tools and methods for genomic data analysis[1], it is still challenging for a genomic researcher to organize these tools, integrate them into workable pipelines, find accessible computational platforms, configure the computing environment, and perform the actual analysis.\nTo address these challenges, the MapReduce[9] model and the corresponding Apache Hadoop framework have been widely adopted to handle large data sets using parallel processing tools.[10] The most widely used open-source implementation of the MapReduce programming model for big data batch processing is Apache Hadoop. A cloud-based bioinformatics workflow platform has also been proposed for genomic researchers. Scientific workflow systems such as Galaxy[11] and Taverna[12] offer simple web-based workflow toolkits and scalable computing environments to meet this challenge.\nSuch efforts have resulted in significant insight into the technical requirements to leverage cloud computing for the analysis of genomic data[7], but problems still remain to be solved. 
Even though many applications have been developed for the analysis of genomic data, they are either tools running only on a MapReduce platform such as Hadoop-BAM[13] or Crossbow[14] or general-purpose (mainly Linux-based) programs such as bowtie[2] and bwa.[15] It is crucial to integrate these two types of platform-based applications on a single pipeline. Transferring such large data sets is another problem, as NGS genomic data are often too large to be moved readily to cloud computing platform services.[16]\nWe developed an automatic workflow management system, Closha, to provide a pipeline-based analysis service for massive biological data, especially NGS genomic data. Closha was developed as a hybrid system that can run both Hadoop-based and general-purpose applications on a single analysis pipeline. We also developed a high-speed data transmission solution, KoDS, to transmit a large amount of data at a fast rate. Closha makes it simple to create multi-step analyses using drag-and-drop functionality. Using Closha, programs can be added and connected to each other so that the output of one program becomes the input of other programs. Our cloud-based workflow management system can help users to run in-house pipelines or construct a series of steps in an organized way.\n\nMethods \nGoals of Closha \nThe following three objectives drive the development of Closha. First, Closha seeks to increase access to intricate computational analyses for all genomic researchers, including those with limited or no programming knowledge. Our web-based graphical user interface (GUI) makes it simple to do everything needed for relatively large data analyses. Second, the Closha GUI provides a workflow editor in which users can easily create automated, multi-step analysis pipelines using drag-and-drop. Here, workflows refer to structured procedures that help users construct a series of steps in an organized way. Each step is a specific parametrized action that receives input and produces output. 
The analysis pipelines on Closha are exactly reproducible, and all analysis parameters and inputs are permanently recorded. Lastly, Closha enables users to share their pipelines on the web.\n\nCluster configuration \nAll runs of analysis pipelines on Closha are performed on a cluster of five master nodes and 33 data (slave) nodes (Fig. 1). The Closha hardware system consists of 660 CPU cores, 2 TB of memory, and 800 TB of disk storage in total. Each node has an Intel Xeon E5-2690 v2 3.0 GHz CPU, 96 GB of memory, and 28 TB of disk storage. The data node HDD configuration consists of the Hadoop Distributed File System (HDFS) and a solid state drive (SSD) cache. HDFS is the primary distributed storage used by Hadoop applications. An SSD is a flash-based storage drive that is many times faster than a traditional hard drive, so using an SSD in the data node makes it possible to run Linux-based programs on the Hadoop cluster system. Edge nodes (gateway nodes) are the interface between the Hadoop cluster and the outside network. The edge nodes are commonly used to run client applications and cluster administration tools. The node manager (NM) handles the individual data nodes in a Hadoop cluster.\n\nFigure 1. The architecture of the Closha system. The Closha system consists of distributed computing nodes: the master node (name node), slave nodes (data node), edge nodes, and node manager.\n\n\n\nClosha workspace \nThe Closha GUI workspace is divided into eight panels that show information on the user\u2019s projects, the file explorer, the pipeline modeling screen (canvas), the analysis programs (program panel), the analysis program parameters, the analysis pipeline list (pipeline panel), the list of analysis programs available for use, and the job execution history and current progress (execution and history panel) (Fig. 2).\n\nFigure 2. The interface of the Closha workspace. 
The web-based Closha workflow editor has several panels: (a) the pipeline project list, (b) the file explorer, (c) the canvas: pipeline modeling screen, (d) a table detailing the analysis program, (e) a table detailing the analysis program parameters, (f) the analysis pipeline list, (g) the list of analysis programs available for use, and (h) the pipeline project job execution history and current progress.\n\n\n\nAnalysis pipelines are grouped into categories and can be searched on the pipeline panel. When a pipeline is selected, it is shown in the main window, where its parameters are set and the tool is executed. When a user executes a tool, its output datasets are added to the execution and history panel. The colors on the execution panel show the state of tool execution. Clicking on a dataset in the panel provides a wealth of information, including the tool and parameter settings used to create it.\n\nWorkflow editor (canvas) \nThe canvas is an interface for creating and modifying workflows (analysis pipelines) by arranging and connecting activities to drive processes. The canvas provides the working surface for creating new workflows or editing existing ones. Users can create custom workflows or use existing workflows on the screen. The canvas (Fig. 2) makes it simple to create multi-step analyses using drag-and-drop functionality. Using the canvas, existing and user-uploaded tools can be added and connected so that the output of one tool becomes the input of other tools. Tool parameters can be set in the parameter panel. Workflows enable the automation and repeated running of large analyses. Once created, workflows function as tools. They can be accessed and run from Closha\u2019s main analysis interface.\n\nRepresenting analysis pipelines of workflows \nThe workflows in the analysis pipelines are commonly depicted as directed acyclic graphs, in which each of the vertices (modules or programs) has a unique identifier and represents a task to be performed. 
Additionally, each of the tasks in a workflow can receive inputs and can produce outputs. The outputs of a task can be directed through another task as input. An edge (connector) between two vertices represents the channeling of an output from one task into another. Edges determine the logical sequence. A task can be executed once all of its inputs can be resolved.\n\nUploading data to Closha \nWe developed a fast file transfer tool, called KoDS, for uploading massive genomic data such as exome and RNA-Seq (RNA sequencing) data to the Closha server from the user\u2019s local computer and for downloading the resulting files to the local computer (Fig. 3). The client program of KoDS can be downloaded from the Closha website and be installed on the user\u2019s computer. The KoDS transfer platform provides users with secure high-speed movement of all of their data, supporting a wide range of server, desktop, and Linux operating systems. Using KoDS, users can simultaneously upload an unlimited number of files to Closha. KoDS has a file transfer speed up to 10 times that of normal FTP and HTTP protocols.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 3. Screenshot of the KoDS tool. The left window is the user\u2019s local computer and the right is the Closha server.\n\n\n\nHybrid system \nWe implemented a service-oriented architecture, a hybrid system, to allow arbitrary tools to be described as services. The hybrid system provides access to traditional applications on a cloud infrastructure, which enables users to use both the MapReduce tools and the traditional programs in a single pipeline simultaneously. Thus, the execution of analytical algorithms can be parallelized, speeding up the whole process.\n\nElastic scalability \nScalability is the capability of a system, network, or process to handle a growing amount of work or its potential to be enlarged to accommodate that growth. 
For example, a system is considered scalable if it can increase its total output under an increased load when resources (typically hardware) are added. A system whose performance improves after adding hardware, in proportion to the capacity added, is said to be a scalable system. Scalability is one of the most attractive benefits of cloud computing and provides a useful safety net for when a user\u2019s needs and demands change. The resource manager and the job controller on Closha elastically control the scalability by either increasing or decreasing the required resources.\n\nResults \nAnalysis pipelines \nAs of October 1, 2017, approximately 200 analysis tools were installed on Closha, and 20 analysis pipelines were available for the analysis of exome, RNA-Seq, and ChIP-Seq data, among others. Closha has two types of pipelines: registered and new. Users can use a registered pipeline suitable for their genomic data by selecting a pipeline in the Closha analysis pipeline list. If users want to create a new analysis pipeline, they can build their own pipeline either from scratch or by modifying a registered pipeline with installed or user-defined tools.\n\nRNA-Seq pipeline \nWe use a representative NGS analysis workflow of RNA-Seq to examine the time and cost of execution on Closha cloud configurations. RNA-Seq is a deep-sequencing technique used to explore and profile the entire transcriptome of any organism. Analyzing an organism\u2019s transcriptome is important for understanding the functional elements of a genome. We built an RNA-Seq analysis pipeline in which we use the KoDS tool to move data from a local machine to the Closha server and then obtain the resulting output data at the end of the pipeline. Figure 4a shows a schematic overview of the RNA-Seq pipeline.\n\nFigure 4. Screenshot of the RNA-Seq schematic diagram and its pipeline. (a) Schematic overview of the RNA-Seq pipeline. 
(b) The RNA-Seq pipeline implemented on the Closha canvas.\n\n\n\nThe pipeline includes five analysis tools: TopHat[17], Cufflinks, Cuffmerge, Cuffdiff, and limma voom.[18] TopHat is a fast splice junction mapper that is used to align RNA-Seq reads to large genomes and analyze the mapping results to identify splicing junctions between exons. TopHat internally uses the Bowtie tool, an ultra-high-throughput short read aligner. Cufflinks is used to assemble these alignments into a parsimonious set of transcripts and then estimate the relative abundances of these transcripts. The main purpose of Cuffmerge is to merge several Cufflinks assemblies, making it easier to produce an assembly GTF file suitable for use with Cuffdiff. Cuffdiff is then used to find significant changes in transcript expression, splicing, and promoter use. Finally, voom robustly estimates the mean-variance relationship and generates a precision weight for each individual normalized observation. It can be used to identify differentially expressed genes (DEGs) from the transcript expression level. Figure 4b depicts the implemented RNA-Seq pipeline on the Closha canvas.\nTo evaluate the execution of the RNA-Seq pipeline in Closha, we used an RNA-Seq case-control sample data set: 42,112,235 paired-end case reads and 40,975,645 paired-end control reads. The total sample size of the case and the control reads is approximately 42 GB. The execution of the RNA-Seq DEG pipeline on the case and the control data provided the baseline runtime speed. Closha assigned four CPU cores and 16 GB of memory for a single RNA-Seq job. The execution of the RNA-Seq pipeline on the sample data using Closha took a total of three hours and 44 minutes, and most of the time was spent on the running of the TopHat2 program (two hours and 36 minutes). We performed a comparison experiment between Closha and Galaxy with the same data and the same RNA-Seq pipeline. The same machine was used for the comparison. 
The execution time using Galaxy was six hours and 11 minutes, so Closha ran the RNA-Seq pipeline approximately 1.7 times faster than Galaxy (Table 1).\n\nTable 1. Execution time of each program in Closha and Galaxy for the RNA-Seq analysis\n\nAnalysis step (program)\tRunning time in Closha\tRunning time in Galaxy\nData transfer\tNine minutes\tOne hour and 14 minutes\nFastQC\tThree minutes\tFive minutes\nSickle\tThree minutes\t11 minutes\nTopHat2\tTwo hours and 36 minutes\tThree hours and four minutes\nSAMtools\t13 minutes\t15 minutes\nCufflinks\t10 minutes\t16 minutes\nCuffdiff and voom\t30 minutes\tOne hour and six minutes\nTotal running time\tThree hours and 44 minutes\tSix hours and 11 minutes\n\nTo simulate real use with multiple executions, we ran batches of jobs on the sample data simultaneously, scaling up in increments of 100 jobs. We found little change in execution time as the number of batched jobs increased, which means that the Closha cloud system can run an RNA-Seq pipeline of up to 500 jobs at the same time with little change in execution time (Table 2).\n\nTable 2. Running time of multiple jobs\n\nNumber of jobs\t100\t200\t300\t400\t500\nRunning time of each job\tThree minutes and 44 seconds\tThree minutes and 59 seconds\tThree minutes and 53 seconds\tThree minutes and 42 seconds\tThree minutes and 58 seconds\n\nCreating a new pipeline \nClosha allows users to create their own pipelines to analyze their own data on the canvas. To create a new analysis pipeline, users click the \u2018New Pipeline\u2019 button in the top menu of Closha, enter the name and description of the pipeline, and select an analysis pipeline type. 
Immediately after a pipeline is created with \"New analysis pipeline design\" selected as the project type, the canvas contains only the [Start] and [End] modules. Users can drag and drop the desired analysis programs from the list of analysis programs on the right of the canvas. Once an analysis program is positioned on the canvas, placing the mouse over the edge of its icon creates a connection mark that can be drawn to another module. Starting from the mark, the connector must be dragged until the icon of the next analysis program to be connected turns translucent. Using this method, users connect the start module, the analysis programs, and the end module to perform the analysis.\nThen, users can set the parameter values by clicking the \"Set Parameters\" button on the toolbar before executing the pipeline project. When a project is first created, default parameter values are automatically assigned. Users can change the parameter values to suit the conditions required for analyzing their input data. To connect user files to Closha, the user clicks the \"File Selection\" icon in the field to open a file selection window, chooses personal or common-use data, and then selects the desired file in the file list. When the input data are set, the path for the output file is automatically set to a subpath of the project. Finally, the analysis pipeline is executed, and a message confirms that the analysis has started. The status of the project is displayed in real time in three modes: Complete, Execute, and Wait.\nUsers can view the result files by clicking the \"Result\" icon on the menu bar and download them to the local computer by clicking the \"Download\" button in the bottom menu, which uses KoDS for high-speed transmission. 
Closha also allows users to view files in various formats, including text, HTML, and PNG, on the screen without having to download the files (Fig. 5).\n\nFigure 5. Screenshot of Closha results files. Closha allows users to view files in various formats, including text, HTML, and PNG, on the web without having to download the files.\n\n\n\nDiscussion \nThe Closha computing service is an attractive, efficient, and potentially cost-effective alternative for the analysis of large genomic datasets. Closha offers a dynamic, economical, and versatile solution for large-scale computational analysis. Our work on genomic data demonstrates that the Closha implementation provides a scalable, robust, and efficient solution to address the ever-increasing demand for efficient genomic sequence analysis. Closha allows genomic researchers without informatics or programming expertise to perform complex large-scale analysis with only a web browser. Its potential for computing with NGS genomic data could eventually revolutionize life science and medical informatics.\n\nConclusions \nWe developed a cloud-based workflow management system to provide fast and cost-effective analysis of massive genomic data. We implemented complex workflows making optimal use of high-performance computing clusters. Closha allows users to create multi-step analyses using drag-and-drop functionality and to modify the parameters of pipeline tools. We also developed a high-speed data transmission solution to transmit a large amount of data at a fast rate. KoDS has a file transfer speed of up to 10 times that of normal FTP and HTTP. The computer hardware for Closha is 660 CPU cores and 800 TB of disk storage, enabling 500 jobs to run at the same time. Closha is a scalable, cost-effective, and publicly available web service for large-scale genomic data analysis. Closha supports the reliable and highly scalable execution of sequencing analysis workflows in a fully automated manner. 
Closha provides a user-friendly interface that helps genomic scientists derive accurate results from NGS platform data.\n\nDeclarations \nAcknowledgments \nThe authors would like to thank the anonymous reviewers and Closha users for their time and their valuable comments.\n\nFunding \nPublication costs were funded by the KRIBB Research Initiative Program and the Korean Ministry of Science and Technology (under grant numbers 2010\u20130029345 and 2014M3C9A3064681).\n\nAbout this supplement \nThis article has been published as part of BMC Bioinformatics Volume 19 Supplement 1, 2018: Proceedings of the 28th International Conference on Genome Informatics: bioinformatics. The full contents of the supplement are available online at https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/supplements\/volume-19-supplement-1.\n\nAuthors\u2019 contributions \nGK and PK launched the Closha project and developed the cloud computing service. JY, GH, SP, and WS were responsible for development of the web interface and the back-end cloud system. BL supervised the project. GK and BL wrote the draft of the manuscript. All authors read and approved the final manuscript.\n\nCompeting interests \nThe authors declare that they have no competing interests.\n\nReferences \n\n\n\u2191 1.0 1.1 Souilmi, Y.; Lancaster, A.K.; Jung, J.Y. et al. (2015). \"Scalable and cost-effective NGS genotyping in the cloud\". BMC Medical Genomics 8: 64. doi:10.1186\/s12920-015-0134-9. PMC PMC4608296. PMID 26470712. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4608296 .   \n\n\u2191 2.0 2.1 Afgan, E.; Baker, D.; van den Beek, M. et al. (2016). \"The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update\". Nucleic Acids Research 44 (W1): W3\u2013W10. doi:10.1093\/nar\/gkw343. PMC PMC4987906. PMID 27137889. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4987906 .   
\n\n\u2191 de la Garza, L.; Veit, J.; Szolek, A. et al. (2016). \"From the desktop to the grid: Scalable bioinformatics via workflow conversion\". BMC Bioinformatics 17: 127. doi:10.1186\/s12859-016-0978-9. PMC PMC4788856. PMID 26968893. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4788856 .   \n\n\u2191 Huang, Z.; Rustagi, N.; Veeraraghavan, N. et al. (2016). \"A hybrid computational strategy to address WGS variant analysis in >5000 samples\". BMC Bioinformatics 17 (1): 361. doi:10.1186\/s12859-016-1211-6. PMC PMC5018196. PMID 27612449. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5018196 .   \n\n\u2191 Goecks, J.; Eberhard, C.; Too, T. et al. (2013). \"Web-based visual analysis for high-throughput genomics\". BMC Genomics 14: 397. doi:10.1186\/1471-2164-14-397. PMC PMC3691752. PMID 23758618. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3691752 .   \n\n\u2191 Langdon, W.B. (2015). \"Performance of genetic programming optimised Bowtie2 on genome comparison and analytic testing (GCAT) benchmarks\". BioData Mining 8 (1): 1. doi:10.1186\/s13040-014-0034-0. PMC PMC4304608. PMID 25621011. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4304608 .   \n\n\u2191 7.0 7.1 Yazar, S.; Gooden, G.E.; Mackey, D.A. et al. (2014). \"Benchmarking undedicated cloud computing providers for analysis of genomic datasets\". PLoS One 9 (9): e108490. doi:10.1371\/journal.pone.0108490. PMC PMC4172764. PMID 25247298. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4172764 .   \n\n\u2191 Abouelhoda, M.; Issa, S.A.; Ghanem, M. (2012). \"Tavaxy: Integrating Taverna and Galaxy workflows with cloud computing support\". BMC Bioinformatics 13: 77. doi:10.1186\/1471-2105-13-77. PMC PMC3583125. PMID 22559942. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3583125 .   
\n\n\u2191 O'Driscoll, A.; Daugelaite, J.; Sleator, R.D. (2013). \"'Big data', Hadoop and cloud computing in genomics\". Journal of Biomedical Informatics 46 (5): 774\u201381. doi:10.1016\/j.jbi.2013.07.001. PMID 23872175.   \n\n\u2191 Hiltemann, S.; Mei, H.; de Hollander, M. et al. (2014). \"CGtag: complete genomics toolkit and annotation in a cloud-based Galaxy\". Gigascience 3 (1): 1. doi:10.1186\/2047-217X-3-1. PMC PMC3905657. PMID 24460651. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3905657 .   \n\n\u2191 Goecks, Jeremey; Nekrutenko, Anton; Taylor, James; The Galaxy Team (2010). \"Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences\". Genome Biology 11 (8): R86. doi:10.1186\/gb-2010-11-8-r86. PMC PMC2945788. PMID 20738864. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2945788 .   \n\n\u2191 Oinn, T.; Addis, M.; Ferris, J. et al. (2004). \"Taverna: A tool for the composition and enactment of bioinformatics workflows\". Bioinformatics 20 (17): 3045-54. doi:10.1093\/bioinformatics\/bth361. PMID 15201187.   \n\n\u2191 Niemenmaa, M.; Kallio, A.; Schumacher, A. et al. (2012). \"Hadoop-BAM: Directly manipulating next generation sequencing data in the cloud\". Bioinformatics 28 (6): 876-7. doi:10.1093\/bioinformatics\/bts054. PMC PMC3307120. PMID 22302568. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3307120 .   \n\n\u2191 Zhao, S.; Prenger, K.; Smith, L. et al. (2013). \"Rainbow: A tool for large-scale whole-genome sequencing data analysis using cloud computing\". BMC Genomics 14: 425. doi:10.1186\/1471-2164-14-425. PMC PMC3698007. PMID 23802613. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3698007 .   \n\n\u2191 Gurtowski, J.; Schatz, M.C.; Langmead, B. (2012). \"Genotyping in the cloud with Crossbow\". 
Current Protocols in Bioinformatics 39 (Unit 15.3): 15.3.1\u201315.3.15. doi:10.1002\/0471250953.bi1503s39. PMC PMC3465669. PMID 22948728. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3465669 .   \n\n\u2191 Nagasaki, H.; Mochizuki, T.; Kodama, Y. et al. (2013). \"DDBJ read annotation pipeline: A cloud computing-based pipeline for high-throughput analysis of next-generation sequencing data\". DNA Research 20 (4): 383-90. doi:10.1093\/dnares\/dst017. PMC PMC3738164. PMID 23657089. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3738164 .   \n\n\u2191 Kim, D.; Pertea, G.; Trapnell, C. et al. (2013). \"TopHat2: Accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions\". Genome Biology 14 (4): R36. doi:10.1186\/gb-2013-14-4-r36. PMC PMC4053844. PMID 23618408. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4053844 .   \n\n\u2191 Law, C.W.; Alhamdoosh, M.; Su, S. et al. (2016). \"RNA-seq analysis is easy as 1-2-3 with limma, Glimma and edgeR\". F1000Research 5: 1408. doi:10.12688\/f1000research.9005.2. PMC PMC4937821. PMID 27441086. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4937821 .   \n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to presentation. 
In some cases important information was missing from the references, and that information was added.\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data\">https:\/\/www.limswiki.org\/index.php\/Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data<\/a>\n
This page was last modified on 27 February 2018, at 20:21.\nContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n","684d7a3a2f6583b431b16f4884c7c07d_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_Closha_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:Closha: Bioinformatics workflow system for the analysis of massive sequencing data<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p><b>Background<\/b>: While next-generation sequencing (NGS) costs have fallen in recent years, the cost and complexity of computation remain substantial obstacles to the use of NGS in biomedical care and <a href=\"https:\/\/www.limswiki.org\/index.php\/Genomics\" title=\"Genomics\" target=\"_blank\" class=\"wiki-link\" data-key=\"96a82dabf51cf9510dd00c5a03396c44\">genomic<\/a> research. 
The rapidly increasing amounts of data available from the new high-throughput methods have made data processing infeasible without automated pipelines. The integration of data and analytic resources into workflow systems provides a solution to the problem by simplifying the task of data analysis.\n<\/p><p><b>Results<\/b>: To address this challenge, we developed a cloud-based workflow management system, Closha, to provide fast and cost-effective analysis of massive genomic data. We implemented complex workflows making optimal use of high-performance computing clusters. Closha allows users to create multi-step analyses using drag-and-drop functionality and to modify the parameters of pipeline tools. Users can also import <a href=\"https:\/\/www.limswiki.org\/index.php\/Galaxy_(biomedical_software)\" title=\"Galaxy (biomedical software)\" target=\"_blank\" class=\"wiki-link\" data-key=\"ead5d6ebaa8d67744d2f68d454d89ce6\">Galaxy<\/a> pipelines into Closha. Closha is a hybrid system that enables users to use both analysis programs providing traditional tools and MapReduce-based big data analysis programs simultaneously in a single pipeline. Thus, the execution of analytics algorithms can be parallelized, speeding up the whole process. We also developed a high-speed data transmission solution, KoDS, to transmit a large amount of data at a fast rate. KoDS has a file transfer speed of up to 10 times that of normal FTP and HTTP. The computer hardware for Closha is 660 CPU cores and 800 TB of disk storage, enabling 500 jobs to run at the same time.\n<\/p><p><b>Conclusions<\/b>: Closha is a scalable, cost-effective, and publicly available web service for large-scale genomic data analysis. 
Closha supports the reliable and highly scalable execution of <a href=\"https:\/\/www.limswiki.org\/index.php\/Sequencing\" title=\"Sequencing\" class=\"mw-disambig wiki-link\" target=\"_blank\" data-key=\"e36167a9eb152ca16a0c4c4e6d13f323\">sequencing<\/a> analysis workflows in a fully automated manner. Closha provides a user-friendly interface to all genomic scientists to try to derive accurate results from NGS platform data. The Closha cloud server is freely available for use from <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/closha.kobic.re.kr\/\" target=\"_blank\">http:\/\/closha.kobic.re.kr\/<\/a>.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Background\">Background<\/span><\/h2>\n<p>With the emergence of next-generation sequencing (NGS) technology in 2005, the field of genomics is caught in a data deluge. Modern sequencing platforms are capable of sequencing approximately 5000 M-bases per day.<sup id=\"rdp-ebb-cite_ref-SouilmiScalable15_1-0\" class=\"reference\"><a href=\"#cite_note-SouilmiScalable15-1\" rel=\"external_link\">[1]<\/a><\/sup> DNA sequencing is becoming faster and less expensive at a pace far outstripping Moore\u2019s law, which describes the rate at which computing becomes faster and less expensive. As a result of the increased efficiency and diminished cost of NGS, the demand for clinical and agricultural applications is rapidly increasing.<sup id=\"rdp-ebb-cite_ref-AfganTheGal16_2-0\" class=\"reference\"><a href=\"#cite_note-AfganTheGal16-2\" rel=\"external_link\">[2]<\/a><\/sup> In the <a href=\"https:\/\/www.limswiki.org\/index.php\/Bioinformatics\" title=\"Bioinformatics\" target=\"_blank\" class=\"wiki-link\" data-key=\"8f506695fdbb26e3f314da308f8c053b\">bioinformatics<\/a> community, acquiring massive sequencing data is always followed by large-scale computational analysis to process the data and obtain scientific insights. 
Therefore, investment in a sequencing instrument would normally be accompanied by substantial investment in computer hardware, analysis pipelines, and bioinformatics experts to analyze the data.<sup id=\"rdp-ebb-cite_ref-DeLaGarzaFromThe16_3-0\" class=\"reference\"><a href=\"#cite_note-DeLaGarzaFromThe16-3\" rel=\"external_link\">[3]<\/a><\/sup>\n<\/p><p>When genomic datasets were small, they could be analyzed on personal computers in a few hours or perhaps overnight.<sup id=\"rdp-ebb-cite_ref-HuangAHybrid16_4-0\" class=\"reference\"><a href=\"#cite_note-HuangAHybrid16-4\" rel=\"external_link\">[4]<\/a><\/sup> However, this approach does not apply to large NGS datasets. Instead, researchers require high-performance computers and parallel algorithms to analyze their big genomic data in a timely manner.<sup id=\"rdp-ebb-cite_ref-GoecksWeb13_5-0\" class=\"reference\"><a href=\"#cite_note-GoecksWeb13-5\" rel=\"external_link\">[5]<\/a><\/sup> While high-performance computing is essential for data analysis, only a small number of biomedical research labs are equipped to make effective and successful use of parallel computers.<sup id=\"rdp-ebb-cite_ref-LangdonPerform15_6-0\" class=\"reference\"><a href=\"#cite_note-LangdonPerform15-6\" rel=\"external_link\">[6]<\/a><\/sup> Obstacles include the complexities inherent in managing large NGS datasets and assembling and configuring multi-step genome sequencing pipelines, as well as the difficulties inherent in adapting pipelines to process NGS data on parallel computers.<sup id=\"rdp-ebb-cite_ref-YazarBench14_7-0\" class=\"reference\"><a href=\"#cite_note-YazarBench14-7\" rel=\"external_link\">[7]<\/a><\/sup>\n<\/p><p>The difficulties in creating these complicated computational pipelines, installing and maintaining software packages, and obtaining sufficient computational resources tend to overwhelm bench biologists and prevent them from attempting to analyze their own genomic data.<sup 
id=\"rdp-ebb-cite_ref-AbouelhodaTavaxy12_8-0\" class=\"reference\"><a href=\"#cite_note-AbouelhodaTavaxy12-8\" rel=\"external_link\">[8]<\/a><\/sup> Despite the availability of a vast set of computational tools and methods for genomic data analysis<sup id=\"rdp-ebb-cite_ref-SouilmiScalable15_1-1\" class=\"reference\"><a href=\"#cite_note-SouilmiScalable15-1\" rel=\"external_link\">[1]<\/a><\/sup>, it is still challenging for a genomic researcher to organize these tools, integrate them into workable pipelines, find accessible computational platforms, configure the computing environment, and perform the actual analysis.\n<\/p><p>To address these challenges, the MapReduce<sup id=\"rdp-ebb-cite_ref-ODriscollBig13_9-0\" class=\"reference\"><a href=\"#cite_note-ODriscollBig13-9\" rel=\"external_link\">[9]<\/a><\/sup> model and the corresponding Apache Hadoop framework have been widely adopted to handle large data sets using parallel processing tools.<sup id=\"rdp-ebb-cite_ref-HiltemannCGtag14_10-0\" class=\"reference\"><a href=\"#cite_note-HiltemannCGtag14-10\" rel=\"external_link\">[10]<\/a><\/sup> The most widely used open-source implementation of the MapReduce programming model for big data batch processing is Apache Hadoop. A cloud-based bioinformatics workflow platform has also been proposed for genomic researchers. 
Scientific workflow systems such as Galaxy<sup id=\"rdp-ebb-cite_ref-GoecksGal10_11-0\" class=\"reference\"><a href=\"#cite_note-GoecksGal10-11\" rel=\"external_link\">[11]<\/a><\/sup> and Taverna<sup id=\"rdp-ebb-cite_ref-OinnTaverna04_12-0\" class=\"reference\"><a href=\"#cite_note-OinnTaverna04-12\" rel=\"external_link\">[12]<\/a><\/sup> offer simple web-based workflow toolkits and scalable computing environments to meet this challenge.\n<\/p><p>Such efforts have resulted in significant insight into the technical requirements for leveraging cloud computing for the analysis of genomic data<sup id=\"rdp-ebb-cite_ref-YazarBench14_7-1\" class=\"reference\"><a href=\"#cite_note-YazarBench14-7\" rel=\"external_link\">[7]<\/a><\/sup>, but problems still remain to be solved. Even though many applications have been developed for the analysis of genomic data, they are either tools that run only on a MapReduce platform, such as Hadoop-BAM<sup id=\"rdp-ebb-cite_ref-NiemenmaaHadoop12_13-0\" class=\"reference\"><a href=\"#cite_note-NiemenmaaHadoop12-13\" rel=\"external_link\">[13]<\/a><\/sup> or Crossbow<sup id=\"rdp-ebb-cite_ref-ZhaoRainbow13_14-0\" class=\"reference\"><a href=\"#cite_note-ZhaoRainbow13-14\" rel=\"external_link\">[14]<\/a><\/sup>, or general-purpose (mainly Linux-based) programs such as Bowtie<sup id=\"rdp-ebb-cite_ref-AfganTheGal16_2-1\" class=\"reference\"><a href=\"#cite_note-AfganTheGal16-2\" rel=\"external_link\">[2]<\/a><\/sup> and BWA.<sup id=\"rdp-ebb-cite_ref-GurtowskiGeno12_15-0\" class=\"reference\"><a href=\"#cite_note-GurtowskiGeno12-15\" rel=\"external_link\">[15]<\/a><\/sup> It is crucial to integrate these two types of platform-based applications into a single pipeline. 
Transferring such big data is another problem, as NGS genomic data are often too large to be moved readily to cloud computing platform services.<sup id=\"rdp-ebb-cite_ref-NagasakiDDBJ13_16-0\" class=\"reference\"><a href=\"#cite_note-NagasakiDDBJ13-16\" rel=\"external_link\">[16]<\/a><\/sup>\n<\/p><p>We developed an automatic workflow management system, Closha, to provide a pipeline-based analysis service for massive biological data, especially NGS genomic data. Closha was developed as a hybrid system that can run both Hadoop-based and general-purpose applications on a single analysis pipeline. We also developed a high-speed data transmission solution, KoDS, to transmit a large amount of data at a fast rate. Closha makes it simple to create multi-step analyses using drag-and-drop functionality. Using Closha, programs can be added and connected to each other so that the output of one program becomes the input of other programs. Our cloud-based workflow management system can help users to run in-house pipelines or construct a series of steps in an organized way.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Methods\">Methods<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Goals_of_Closha\">Goals of Closha<\/span><\/h3>\n<p>The following three objectives drive the development of Closha. First, Closha seeks to increase access to intricate computational analyses for all genomic researchers, including those with limited or no programming knowledge. Our web-based graphical user interface (GUI) makes it simple to do everything needed for relatively large data analyses. Second, the Closha GUI provides a workflow editor in which users can simply create automated, multi-step analysis pipelines using drag-and-drop. Here, workflows refer to structured procedures that help users construct a series of steps in an organized way. Each step is a specific parametrized action that receives input and produces output. 
The analysis pipelines on Closha are exactly reproducible, and all analysis parameters and inputs are permanently recorded. Lastly, Closha enables users to share their pipelines on the web.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Cluster_configuration\">Cluster configuration<\/span><\/h3>\n<p>All runs of analysis pipelines on Closha are performed on a cluster of five master nodes and 33 data (slave) nodes (Fig. 1). The Closha hardware system consists of 660 CPU cores, 2 TB of memory, and 800 TB of disk storage in total. Each node has an Intel Xeon E5-2690 v2 3.0 GHz CPU, 96 GB of memory, and 28 TB of disk storage. The data node HDD configuration consists of the Hadoop Distributed File System (HDFS) and a solid state drive (SSD) cache. HDFS is the primary distributed storage used by Hadoop applications. An SSD is a flash-based storage drive that is many times faster than a traditional hard drive, so using an SSD in the data node makes it possible to run Linux-based programs on the Hadoop cluster system. Edge nodes (gateway nodes) are the interface between the Hadoop cluster and the outside network. The edge nodes are commonly used to run client applications and cluster administration tools. 
The node manager (NM) handles the individual data nodes in a Hadoop cluster.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig1_Closha_BMCBioinfo2018_19-Sup1.gif\" class=\"image wiki-link\" target=\"_blank\" data-key=\"a03197c39a75037346874c626594e4ae\"><img alt=\"Fig1 Closha BMCBioinfo2018 19-Sup1.gif\" src=\"https:\/\/www.limswiki.org\/images\/8\/8a\/Fig1_Closha_BMCBioinfo2018_19-Sup1.gif\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 1.<\/b> The architecture of the Closha system. The Closha system consists of distributed computing nodes: the master node (name node), slave nodes (data node), edge nodes, and node manager.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"Closha_workspace\">Closha workspace<\/span><\/h3>\n<p>The Closha GUI workspace is divided into eight panels that show information on the user\u2019s projects, the file explorer, the pipeline modelling screen (canvas), the analysis programs (program panel), the analysis program parameters, the analysis pipeline list (pipeline panel), the list of analysis programs available for use, and the job execution history and current progress (execution and history panel) (Fig. 
2).\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig2_Closha_BMCBioinfo2018_19-Sup1.gif\" class=\"image wiki-link\" target=\"_blank\" data-key=\"1e3cd8f082286aa3c4a0cc33391a561d\"><img alt=\"Fig2 Closha BMCBioinfo2018 19-Sup1.gif\" src=\"https:\/\/www.limswiki.org\/images\/a\/a7\/Fig2_Closha_BMCBioinfo2018_19-Sup1.gif\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 2.<\/b> The interface of the Closha workspace. The web-based Closha workflow editor has several panels: (<b>a<\/b>) the pipeline project list, (<b>b<\/b>) the file explorer, (<b>c<\/b>) the canvas: pipeline modeling screen, (<b>d<\/b>) a table detailing the analysis program, (<b>e<\/b>) a table detailing the analysis program parameters, (<b>f<\/b>) the analysis pipeline list, (<b>g<\/b>) the list of analysis programs available for use, and (<b>h<\/b>) the pipeline project job execution history and current progress.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>Analysis pipelines are grouped into categories and can be searched on the pipeline panel. When a pipeline is selected, it is shown in the main window, where its parameters are set and the tool is executed. When a user executes a tool, its output datasets are added to the execution and history panel. The colors on the execution panel show the state of tool execution. 
Clicking on a dataset in the panel provides a wealth of information, including the tool and parameter settings used to create it.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Workflow_editor_.28canvas.29\">Workflow editor (canvas)<\/span><\/h3>\n<p>The canvas is an interface for creating and modifying workflows (analysis pipelines) by arranging and connecting activities to drive processes. The canvas provides the working surface for creating new workflows or editing existing ones. Users can create custom workflows or use existing workflows on the screen. The canvas (Fig. 2) makes it simple to create multi-step analyses using drag-and-drop functionality. Using the canvas, existing and user-uploaded tools can be added and connected so that the output of one tool becomes the input of other tools. Tool parameters can be set in the parameter panel. Workflows enable the automation and repeated running of large analyses. Once created, workflows function as tools. They can be accessed and run from Closha\u2019s main analysis interface.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Representing_analysis_pipelines_of_workflows\">Representing analysis pipelines of workflows<\/span><\/h3>\n<p>The workflows in the analysis pipelines are commonly depicted as directed acyclic graphs, in which each of the vertices (modules or programs) has a unique identifier and represents a task to be performed. Additionally, each of the tasks in a workflow can receive inputs and can produce outputs. The outputs of a task can be directed to another task as input. An edge (connector) between two vertices represents the channeling of an output from one task into another. Edges determine the logical sequence. 
A task can be executed once all of its inputs can be resolved.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Uploading_data_to_Closha\">Uploading data to Closha<\/span><\/h3>\n<p>We developed a fast file transfer tool, called KoDS, for uploading massive genomic data such as exome and RNA-Seq (RNA sequencing) data to the Closha server from the user\u2019s local computer and for downloading the resulting files to the local computer (Fig. 3). The client program of KoDS can be downloaded from the Closha website and be installed on the user\u2019s computer. The KoDS transfer platform provides users with secure high-speed movement of all of their data, supporting a wide range of server, desktop, and Linux operating systems. Using KoDS, users can simultaneously upload an unlimited number of files to Closha. KoDS has a file transfer speed up to 10 times that of normal FTP and HTTP protocols.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig3_Closha_BMCBioinfo2018_19-Sup1.gif\" class=\"image wiki-link\" target=\"_blank\" data-key=\"e12e441e25850cd6fefb92f35f3574aa\"><img alt=\"Fig3 Closha BMCBioinfo2018 19-Sup1.gif\" src=\"https:\/\/www.limswiki.org\/images\/9\/98\/Fig3_Closha_BMCBioinfo2018_19-Sup1.gif\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 3.<\/b> Screenshot of the KoDS tool. The left window is the user\u2019s local computer and the right is the Closha server.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"Hybrid_system\">Hybrid system<\/span><\/h3>\n<p>We implemented a service-oriented architecture, a hybrid system, to allow arbitrary tools to be described as services. 
The hybrid system provides access to traditional applications on a cloud infrastructure, which enables users to use both the MapReduce tools and the traditional programs in a single pipeline simultaneously. Thus, the execution of analytical algorithms can be parallelized, speeding up the whole process.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Elastic_scalability\">Elastic scalability<\/span><\/h3>\n<p>Scalability is the capability of a system, network, or process to handle a growing amount of work or its potential to be enlarged to accommodate that growth. For example, a system is considered scalable if it can increase its total output under an increased load when resources (typically hardware) are added. A system whose performance improves after adding hardware, in proportion to the capacity added, is said to be a scalable system. Scalability is one of the most attractive features of cloud computing and provides a useful safety net when a user\u2019s needs and demands change. The resource manager and the job controller on Closha provide elastic scalability by increasing or decreasing the allocated resources as required.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Results\">Results<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Analysis_pipelines\">Analysis pipelines<\/span><\/h3>\n<p>As of October 1, 2017, approximately 200 analysis tools were installed on Closha, and 20 analysis pipelines were available for the analysis of exome, RNA-Seq, and ChIP-Seq data, among others. Closha has two types of pipelines: registered and new. Users can use a registered pipeline suitable for their genomic data by selecting a pipeline in the Closha analysis pipeline list. 
If users want to create a new analysis pipeline, they can build their own pipeline either from scratch or by modifying a registered pipeline with installed or user-defined tools.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"RNA-Seq_pipeline\">RNA-Seq pipeline<\/span><\/h3>\n<p>We use a representative NGS analysis workflow of RNA-Seq to examine the time and cost of execution on Closha cloud configurations. RNA-Seq is a deep-sequencing technique used to explore and profile the entire transcriptome of any organism. Analyzing an organism\u2019s transcriptome is important for understanding the functional elements of a genome. We built an RNA-Seq analysis pipeline in which we use the KoDS tool to move data from a local machine to the Closha server. Figure 4a shows a schematic overview of the RNA-Seq pipeline. Then, we can obtain the resulting output data at the end of the pipeline.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig4_Closha_BMCBioinfo2018_19-Sup1.gif\" class=\"image wiki-link\" target=\"_blank\" data-key=\"dba0bdc73a0061744059c93da2ea09e9\"><img alt=\"Fig4 Closha BMCBioinfo2018 19-Sup1.gif\" src=\"https:\/\/www.limswiki.org\/images\/f\/fd\/Fig4_Closha_BMCBioinfo2018_19-Sup1.gif\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 4.<\/b> Screenshot of the RNA-Seq schematic diagram and its pipeline. (<b>a<\/b>) Schematic overview of the RNA-Seq pipeline. 
(<b>b<\/b>) The RNA-Seq pipeline implemented on the Closha canvas.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>The pipeline includes five analysis tools: TopHat<sup id=\"rdp-ebb-cite_ref-KimTopHat13_17-0\" class=\"reference\"><a href=\"#cite_note-KimTopHat13-17\" rel=\"external_link\">[17]<\/a><\/sup>, Cufflinks, Cuffmerge, Cuffdiff, and limma voom.<sup id=\"rdp-ebb-cite_ref-LawRNA16_18-0\" class=\"reference\"><a href=\"#cite_note-LawRNA16-18\" rel=\"external_link\">[18]<\/a><\/sup> TopHat is a fast splice junction mapper that is used to align RNA-Seq reads to large genomes and analyze the mapping results to identify splicing junctions between exons. TopHat internally uses the Bowtie tool, an ultra-high-throughput short read aligner. Cufflinks is used to assemble these alignments into a parsimonious set of transcripts and then estimate the relative abundances of these transcripts. The main purpose of Cuffmerge is to merge several Cufflinks assemblies, making it easier to produce an assembly GTF file suitable for use with Cuffdiff. Cuffdiff is then used to find significant changes in transcript expression, splicing, and promoter use. Finally, voom robustly estimates the mean-variance relationship and generates a precision weight for each individual normalized observation. It can be used to identify differentially expressed genes (DEGs) from the transcript expression levels. Figure 4b depicts the implemented RNA-Seq pipeline on the Closha canvas.\n<\/p><p>To evaluate the execution of the RNA-Seq pipeline in Closha, we used an RNA-Seq case-control sample data set: 42,112,235 paired-end case reads and 40,975,645 paired-end control reads. The total size of the case and the control reads is approximately 42 GB. The execution of the RNA-Seq DEG pipeline on the case and the control data provided the baseline runtime speed. Closha assigned four CPU cores and 16 GB of memory for a single RNA-Seq job. 
The execution of the RNA-Seq pipeline on the sample data using Closha took a total of three hours and 44 minutes, and most of the time was spent on the running of the TopHat2 program (two hours and 36 minutes). We performed a comparison experiment between Closha and Galaxy with the same data and the same RNA-Seq pipeline. The same machine was used for the comparison. The execution time using Galaxy was six hours and 11 minutes, showing Closha has approximately 1.7 times better performance than Galaxy in the execution of the RNA-Seq pipeline (Table 1).\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"3\"><b>Table 1.<\/b> Execution time of each program of Closha and Galaxy in the RNA-Seq analysis\n<\/td><\/tr>\n<tr>\n<th style=\"padding-left:10px; padding-right:10px;\" rowspan=\"2\">Analysis steps (programs)\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\" colspan=\"2\">Running time\n<\/th><\/tr>\n<tr>\n<th style=\"padding-left:10px; padding-right:10px;\">Closha\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Galaxy\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Data transfer\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Nine minutes\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">One hour and 14 minutes\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">FastQC\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Three minutes\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Five minutes\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; 
padding-right:10px;\">Sickle\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Three minutes\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">11 minutes\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">TopHat2\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Two hours and 36 minutes\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Three hours and four minutes\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">SAMtools\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">13 minutes\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">15 minutes\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Cufflinks\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">10 minutes\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">16 minutes\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Cuffdiff and voom\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">30 minutes\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">One hour and six minutes\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Total running time: Three hours and 44 minutes\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Total running time: Six hours and 11 minutes\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>To simulate real use with multiple executions, we performed batched jobs of the example data 
simultaneously, scaling up by adding 100 jobs of the sample data. We found little change in execution time as the number of batched jobs increased, which means that the Closha cloud system can run an RNA-Seq pipeline of up to 500 jobs at the same time with little change in execution time (Table 2).\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"6\"><b>Table 2.<\/b> Running time of multiple jobs\n<\/td><\/tr>\n<tr>\n<th style=\"padding-left:10px; padding-right:10px;\">Number of jobs\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">100\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">200\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">300\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">400\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">500\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Running time of each job\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Three minutes and 44 seconds\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Three minutes and 59 seconds\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Three minutes and 53 seconds\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Three minutes and 42 seconds\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Three minutes and 58 seconds\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"Creating_a_new_pipeline\">Creating a new pipeline<\/span><\/h3>\n<p>Closha allows users to create their own pipelines to analyze their own data on the canvas. 
To create a new analysis pipeline, users click the \u2018New Pipeline\u2019 button in the top menu of Closha, enter the name and description of the pipeline, and select an analysis pipeline type. When \"New analysis pipeline design\" is selected as the project type, the newly created pipeline contains only the [Start] and [End] modules on the canvas. Users can then drag and drop their desired analysis programs from the list of analysis programs on the right of the canvas. Once an analysis program is positioned on the canvas, placing the mouse over the edge of its icon creates a connection mark from which a connector can be drawn to another module. Starting from the mark, the connector must be dragged until the icon of the next analysis program to be connected turns translucent. Using this method, users connect the start module, the analysis programs, and the end module to perform the analysis.\n<\/p><p>Then, users can set the parameter values by clicking the \"Set Parameters\" button on the toolbar before executing the pipeline project. When a project is first created, default parameter values are automatically assigned. Users can change the parameter values to suit the conditions required for analyzing their input data. To connect user files to Closha, the user can click the \"File Selection\" icon in the field to open a window for selecting an input file, choosing first between personal and common-use data and then the desired file in the file list. The path for the output file is automatically set to a subpath of the project when the input data are set. Finally, the analysis pipeline is executed, and a message indicates that the analysis has started. 
The status of the project is displayed in real time as one of three states: Complete, Execute, or Wait.\n<\/p><p>Users can see the results files by clicking the \"Result\" icon on the menu bar and download them to the local computer by clicking the \"Download\" button in the bottom menu, which allows KoDS to be used for high-speed transmission. Closha also allows users to view files in various formats, including text, HTML, and PNG, on the screen without having to download the files (Fig. 5).\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig5_Closha_BMCBioinfo2018_19-Sup1.gif\" class=\"image wiki-link\" target=\"_blank\" data-key=\"90e6df3d35164db7c179f9837768f89e\"><img alt=\"Fig5 Closha BMCBioinfo2018 19-Sup1.gif\" src=\"https:\/\/www.limswiki.org\/images\/b\/b8\/Fig5_Closha_BMCBioinfo2018_19-Sup1.gif\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 5.<\/b> Screenshot of Closha results files. Closha allows users to view files in various formats, including text, HTML, and PNG, on the web without having to download the files.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h2><span class=\"mw-headline\" id=\"Discussion\">Discussion<\/span><\/h2>\n<p>The Closha computing service is an attractive, efficient, and potentially cost-effective alternative for the analysis of large genomic datasets. Closha offers a dynamic, economical, and versatile solution for large-scale computational analysis. Our work on genomic data demonstrates that the Closha implementation provides a scalable, robust, and efficient solution to address the ever-increasing demand for efficient genomic sequence analysis. 
Closha allows genomic researchers without <a href=\"https:\/\/www.limswiki.org\/index.php\/Informatics\" title=\"Informatics\" class=\"mw-disambig wiki-link\" target=\"_blank\" data-key=\"ea0ff624ac3a644c35d2b51d39047bdf\">informatics<\/a> or programming expertise to perform complex large-scale analysis with only a web browser. Its potential for computing with NGS genomic data could eventually revolutionize life science and <a href=\"https:\/\/www.limswiki.org\/index.php\/Medical_informatics\" title=\"Medical informatics\" class=\"mw-redirect wiki-link\" target=\"_blank\" data-key=\"f89ecb3b26617b8c6e09bc5e050cfd5d\">medical informatics<\/a>.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Conclusions\">Conclusions<\/span><\/h2>\n<p>We developed a cloud-based workflow management system to provide fast and cost-effective analysis of massive genomic data. We implemented complex workflows making optimal use of high-performance computing clusters. Closha allows users to create multi-step analyses using drag-and-drop functionality and to modify the parameters of pipeline tools. We also developed a high-speed data transmission solution, KoDS, for transferring large amounts of data quickly. KoDS has a file transfer speed of up to 10 times that of normal FTP and HTTP. The computer hardware for Closha comprises 660 CPU cores and 800 TB of disk storage, enabling 500 jobs to run at the same time. Closha is a scalable, cost-effective, and publicly available web service for large-scale genomic data analysis. Closha supports the reliable and highly scalable execution of sequencing analysis workflows in a fully automated manner. 
Closha provides a user-friendly interface that helps genomic scientists derive accurate results from NGS platform data.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Declarations\">Declarations<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Acknowledgments\">Acknowledgments<\/span><\/h3>\n<p>The authors would like to thank the anonymous reviewers and Closha users for their time and their valuable comments.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Funding\">Funding<\/span><\/h4>\n<p>Publication costs were funded by the KRIBB Research Initiative Program and the Korean Ministry of Science and Technology (under grant numbers 2010\u20130029345 and 2014M3C9A3064681).\n<\/p>\n<h4><span class=\"mw-headline\" id=\"About_this_supplement\">About this supplement<\/span><\/h4>\n<p>This article has been published as part of BMC Bioinformatics Volume 19 Supplement 1, 2018: Proceedings of the 28th International Conference on Genome Informatics: bioinformatics. The full contents of the supplement are available online at <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/supplements\/volume-19-supplement-1\" target=\"_blank\">https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/supplements\/volume-19-supplement-1<\/a>.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Author.E2.80.99s_contributions\">Authors\u2019 contributions<\/span><\/h4>\n<p>GK and PK launched the Closha project and developed the cloud computing service. JY, GH, SP, and WS were responsible for development of the web interface and the back-end cloud system. BL supervised the project. GK and BL wrote the draft of the manuscript. 
All authors read and approved the final manuscript.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Competing_interests\">Competing interests<\/span><\/h4>\n<p>The authors declare that they have no competing interests.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-SouilmiScalable15-1\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-SouilmiScalable15_1-0\" rel=\"external_link\">1.0<\/a><\/sup> <sup><a href=\"#cite_ref-SouilmiScalable15_1-1\" rel=\"external_link\">1.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Souilmi, Y.; Lancaster, A.K.; Jung, J.Y. et al. (2015). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4608296\" target=\"_blank\">\"Scalable and cost-effective NGS genotyping in the cloud\"<\/a>. <i>BMC Medical Genomics<\/i> <b>8<\/b>: 64. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2Fs12920-015-0134-9\" target=\"_blank\">10.1186\/s12920-015-0134-9<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4608296\/\" target=\"_blank\">PMC4608296<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26470712\" target=\"_blank\">26470712<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4608296\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4608296<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Scalable+and+cost-effective+NGS+genotyping+in+the+cloud&rft.jtitle=BMC+Medical+Genomics&rft.aulast=Souilmi%2C+Y.%3B+Lancaster%2C+A.K.%3B+Jung%2C+J.Y.+et+al.&rft.au=Souilmi%2C+Y.%3B+Lancaster%2C+A.K.%3B+Jung%2C+J.Y.+et+al.&rft.date=2015&rft.volume=8&rft.pages=64&rft_id=info:doi\/10.1186%2Fs12920-015-0134-9&rft_id=info:pmc\/PMC4608296&rft_id=info:pmid\/26470712&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4608296&rfr_id=info:sid\/en.wikipedia.org:Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AfganTheGal16-2\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-AfganTheGal16_2-0\" rel=\"external_link\">2.0<\/a><\/sup> <sup><a href=\"#cite_ref-AfganTheGal16_2-1\" rel=\"external_link\">2.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Afgan, E.; Baker, D.; van den Beek, M. et al. (2016). 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4987906\" target=\"_blank\">\"The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update\"<\/a>. <i>Nucleic Acids Research<\/i> <b>44<\/b> (W1): W3\u2013W10. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fnar%2Fgkw343\" target=\"_blank\">10.1093\/nar\/gkw343<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4987906\/\" target=\"_blank\">PMC4987906<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/27137889\" target=\"_blank\">27137889<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4987906\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4987906<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+Galaxy+platform+for+accessible%2C+reproducible+and+collaborative+biomedical+analyses%3A+2016+update&rft.jtitle=Nucleic+Acids+Research&rft.aulast=Afgan%2C+E.%3B+Baker%2C+D.%3B+van+den+Beek%2C+M.+et+al.&rft.au=Afgan%2C+E.%3B+Baker%2C+D.%3B+van+den+Beek%2C+M.+et+al.&rft.date=2016&rft.volume=44&rft.issue=W1&rft.pages=W3%E2%80%93W10&rft_id=info:doi\/10.1093%2Fnar%2Fgkw343&rft_id=info:pmc\/PMC4987906&rft_id=info:pmid\/27137889&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4987906&rfr_id=info:sid\/en.wikipedia.org:Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DeLaGarzaFromThe16-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-DeLaGarzaFromThe16_3-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">de la Garza, L.; Veit, J.; Szolek, A. et al. (2016). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4788856\" target=\"_blank\">\"From the desktop to the grid: Scalable bioinformatics via workflow conversion\"<\/a>. <i>BMC Bioinformatics<\/i> <b>17<\/b>: 127. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2Fs12859-016-0978-9\" target=\"_blank\">10.1186\/s12859-016-0978-9<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4788856\/\" target=\"_blank\">PMC4788856<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26968893\" target=\"_blank\">26968893<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4788856\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4788856<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=From+the+desktop+to+the+grid%3A+Scalable+bioinformatics+via+workflow+conversion&rft.jtitle=BMC+Bioinformatics&rft.aulast=de+la+Garza%2C+L.%3B+Veit%2C+J.%3B+Szolek%2C+A.+et+al.&rft.au=de+la+Garza%2C+L.%3B+Veit%2C+J.%3B+Szolek%2C+A.+et+al.&rft.date=2016&rft.volume=17&rft.pages=127&rft_id=info:doi\/10.1186%2Fs12859-016-0978-9&rft_id=info:pmc\/PMC4788856&rft_id=info:pmid\/26968893&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4788856&rfr_id=info:sid\/en.wikipedia.org:Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li 
id=\"cite_note-HuangAHybrid16-4\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HuangAHybrid16_4-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Huang, Z.; Rustagi, N.; Veeraraghavan, N. et al. (2016). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5018196\" target=\"_blank\">\"A hybrid computational strategy to address WGS variant analysis in >5000 samples\"<\/a>. <i>BMC Bioinformatics<\/i> <b>17<\/b> (1): 361. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2Fs12859-016-1211-6\" target=\"_blank\">10.1186\/s12859-016-1211-6<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5018196\/\" target=\"_blank\">PMC5018196<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/27612449\" target=\"_blank\">27612449<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5018196\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5018196<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+hybrid+computational+strategy+to+address+WGS+variant+analysis+in+%3E5000+samples&rft.jtitle=BMC+Bioinformatics&rft.aulast=Huang%2C+Z.%3B+Rustagi%2C+N.%3B+Veeraraghavan%2C+N.+et+al.&rft.au=Huang%2C+Z.%3B+Rustagi%2C+N.%3B+Veeraraghavan%2C+N.+et+al.&rft.date=2016&rft.volume=17&rft.issue=1&rft.pages=361&rft_id=info:doi\/10.1186%2Fs12859-016-1211-6&rft_id=info:pmc\/PMC5018196&rft_id=info:pmid\/27612449&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5018196&rfr_id=info:sid\/en.wikipedia.org:Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GoecksWeb13-5\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GoecksWeb13_5-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Goecks, J.; Eberhard, C.; Too, T. et al. (2013). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3691752\" target=\"_blank\">\"Web-based visual analysis for high-throughput genomics\"<\/a>. <i>BMC Genomics<\/i> <b>14<\/b>: 397. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2F1471-2164-14-397\" target=\"_blank\">10.1186\/1471-2164-14-397<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3691752\/\" target=\"_blank\">PMC3691752<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23758618\" target=\"_blank\">23758618<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3691752\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3691752<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Web-based+visual+analysis+for+high-throughput+genomics&rft.jtitle=BMC+Genomics&rft.aulast=Goecks%2C+J.%3B+Eberhard%2C+C.%3B+Too%2C+T.+et+al.&rft.au=Goecks%2C+J.%3B+Eberhard%2C+C.%3B+Too%2C+T.+et+al.&rft.date=2013&rft.volume=14&rft.pages=397&rft_id=info:doi\/10.1186%2F1471-2164-14-397&rft_id=info:pmc\/PMC3691752&rft_id=info:pmid\/23758618&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3691752&rfr_id=info:sid\/en.wikipedia.org:Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LangdonPerform15-6\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LangdonPerform15_6-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Langdon, W.B. (2015). 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4304608\" target=\"_blank\">\"Performance of genetic programming optimised Bowtie2 on genome comparison and analytic testing (GCAT) benchmarks\"<\/a>. <i>BioData Mining<\/i> <b>8<\/b> (1): 1. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2Fs13040-014-0034-0\" target=\"_blank\">10.1186\/s13040-014-0034-0<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4304608\/\" target=\"_blank\">PMC4304608<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25621011\" target=\"_blank\">25621011<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4304608\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4304608<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Performance+of+genetic+programming+optimised+Bowtie2+on+genome+comparison+and+analytic+testing+%28GCAT%29+benchmarks&rft.jtitle=BioData+Mining&rft.aulast=Langdon%2C+W.B.&rft.au=Langdon%2C+W.B.&rft.date=2015&rft.volume=8&rft.issue=1&rft.pages=1&rft_id=info:doi\/10.1186%2Fs13040-014-0034-0&rft_id=info:pmc\/PMC4304608&rft_id=info:pmid\/25621011&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4304608&rfr_id=info:sid\/en.wikipedia.org:Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-YazarBench14-7\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-YazarBench14_7-0\" rel=\"external_link\">7.0<\/a><\/sup> <sup><a href=\"#cite_ref-YazarBench14_7-1\" rel=\"external_link\">7.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Yazar, S.; Gooden, G.E.; Mackey, D.A. et al. (2014). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4172764\" target=\"_blank\">\"Benchmarking undedicated cloud computing providers for analysis of genomic datasets\"<\/a>. <i>PLoS One<\/i> <b>9<\/b> (9): e108490. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pone.0108490\" target=\"_blank\">10.1371\/journal.pone.0108490<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4172764\/\" target=\"_blank\">PMC4172764<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25247298\" target=\"_blank\">25247298<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4172764\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4172764<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Benchmarking+undedicated+cloud+computing+providers+for+analysis+of+genomic+datasets&rft.jtitle=PLoS+One&rft.aulast=Yazar%2C+S.%3B+Gooden%2C+G.E.%3B+Mackey%2C+D.A.+et+al.&rft.au=Yazar%2C+S.%3B+Gooden%2C+G.E.%3B+Mackey%2C+D.A.+et+al.&rft.date=2014&rft.volume=9&rft.issue=9&rft.pages=e108490&rft_id=info:doi\/10.1371%2Fjournal.pone.0108490&rft_id=info:pmc\/PMC4172764&rft_id=info:pmid\/25247298&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4172764&rfr_id=info:sid\/en.wikipedia.org:Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li 
id=\"cite_note-AbouelhodaTavaxy12-8\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AbouelhodaTavaxy12_8-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Abouelhoda, M.; Issa, S.A.; Ghanem, M. (2012). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3583125\" target=\"_blank\">\"Tavaxy: Integrating Taverna and Galaxy workflows with <\/a><a href=\"https:\/\/www.limswiki.org\/index.php\/Cloud_computing\" title=\"Cloud computing\" target=\"_blank\" class=\"wiki-link\" data-key=\"fcfe5882eaa018d920cedb88398b604f\">cloud computing<\/a><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3583125\" target=\"_blank\"> support\"<\/a>. <i>BMC Bioinformatics<\/i> <b>13<\/b>: 77. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2F1471-2105-13-77\" target=\"_blank\">10.1186\/1471-2105-13-77<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3583125\/\" target=\"_blank\">PMC3583125<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/22559942\" target=\"_blank\">22559942<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3583125\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3583125<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Tavaxy%3A+Integrating+Taverna+and+Galaxy+workflows+with+%5B%5Bcloud+computing%5D%5D+support&rft.jtitle=BMC+Bioinformatics&rft.aulast=Abouelhoda%2C+M.%3B+Issa%2C+S.A.%3B+Ghanem%2C+M.&rft.au=Abouelhoda%2C+M.%3B+Issa%2C+S.A.%3B+Ghanem%2C+M.&rft.date=2012&rft.volume=13&rft.pages=77&rft_id=info:doi\/10.1186%2F1471-2105-13-77&rft_id=info:pmc\/PMC3583125&rft_id=info:pmid\/22559942&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3583125&rfr_id=info:sid\/en.wikipedia.org:Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ODriscollBig13-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ODriscollBig13_9-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">O'Driscoll, A.; Daugelaite, J.; Sleator, R.D. (2013). \"'Big data', Hadoop and cloud computing in genomics\". <i>Journal of Biomedical Informatics<\/i> <b>46<\/b> (5): 774\u201381. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.jbi.2013.07.001\" target=\"_blank\">10.1016\/j.jbi.2013.07.001<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23872175\" target=\"_blank\">23872175<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=%27Big+data%27%2C+Hadoop+and+cloud+computing+in+genomics&rft.jtitle=Journal+of+Biomedical+Informatics&rft.aulast=O%27Driscoll%2C+A.%3B+Daugelaite%2C+J.%3B+Sleator%2C+R.D.&rft.au=O%27Driscoll%2C+A.%3B+Daugelaite%2C+J.%3B+Sleator%2C+R.D.&rft.date=2013&rft.volume=46&rft.issue=5&rft.pages=774%E2%80%9381&rft_id=info:doi\/10.1016%2Fj.jbi.2013.07.001&rft_id=info:pmid\/23872175&rfr_id=info:sid\/en.wikipedia.org:Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HiltemannCGtag14-10\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HiltemannCGtag14_10-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Hiltemann, S.; Mei, H.; de Hollander, M. et al. (2014). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3905657\" target=\"_blank\">\"CGtag: complete genomics toolkit and annotation in a cloud-based Galaxy\"<\/a>. <i>Gigascience<\/i> <b>3<\/b> (1): 1. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2F2047-217X-3-1\" target=\"_blank\">10.1186\/2047-217X-3-1<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3905657\/\" target=\"_blank\">PMC3905657<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/24460651\" target=\"_blank\">24460651<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3905657\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3905657<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=CGtag%3A+complete+genomics+toolkit+and+annotation+in+a+cloud-based+Galaxy&rft.jtitle=Gigascience&rft.aulast=Hiltemann%2C+S.%3B+Mei%2C+H.%3B+de+Hollander%2C+M.+et+al.&rft.au=Hiltemann%2C+S.%3B+Mei%2C+H.%3B+de+Hollander%2C+M.+et+al.&rft.date=2014&rft.volume=3&rft.issue=1&rft.pages=1&rft_id=info:doi\/10.1186%2F2047-217X-3-1&rft_id=info:pmc\/PMC3905657&rft_id=info:pmid\/24460651&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3905657&rfr_id=info:sid\/en.wikipedia.org:Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GoecksGal10-11\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GoecksGal10_11-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Goecks, Jeremy; Nekrutenko, Anton; Taylor, James; The Galaxy Team (2010). 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2945788\" target=\"_blank\">\"Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences\"<\/a>. <i>Genome Biology<\/i> <b>11<\/b> (8): R86. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2Fgb-2010-11-8-r86\" target=\"_blank\">10.1186\/gb-2010-11-8-r86<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC2945788\/\" target=\"_blank\">PMC2945788<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/20738864\" target=\"_blank\">20738864<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2945788\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC2945788<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Galaxy%3A+a+comprehensive+approach+for+supporting+accessible%2C+reproducible%2C+and+transparent+computational+research+in+the+life+sciences&rft.jtitle=Genome+Biology&rft.aulast=Goecks%2C+Jeremey%3B+Nekrutenko%2C+Anton%3B+Taylor%2C+James%3B+The+Galaxy+Team&rft.au=Goecks%2C+Jeremey%3B+Nekrutenko%2C+Anton%3B+Taylor%2C+James%3B+The+Galaxy+Team&rft.date=2010&rft.volume=11&rft.issue=8&rft.pages=R86&rft_id=info:doi\/10.1186%2Fgb-2010-11-8-r86&rft_id=info:pmc\/PMC2945788&rft_id=info:pmid\/20738864&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC2945788&rfr_id=info:sid\/en.wikipedia.org:Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-OinnTaverna04-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-OinnTaverna04_12-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Oinn, T.; Addis, M.; Ferris, J. et al. (2004). \"Taverna: A tool for the composition and enactment of bioinformatics workflows\". <i>Bioinformatics<\/i> <b>20<\/b> (17): 3045-54. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fbioinformatics%2Fbth361\" target=\"_blank\">10.1093\/bioinformatics\/bth361<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/15201187\" target=\"_blank\">15201187<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Taverna%3A+A+tool+for+the+composition+and+enactment+of+bioinformatics+workflows&rft.jtitle=Bioinformatics&rft.aulast=Oinn%2C+T.%3B+Addis%2C+M.%3B+Ferris%2C+J.+et+al.&rft.au=Oinn%2C+T.%3B+Addis%2C+M.%3B+Ferris%2C+J.+et+al.&rft.date=2004&rft.volume=20&rft.issue=17&rft.pages=3045-54&rft_id=info:doi\/10.1093%2Fbioinformatics%2Fbth361&rft_id=info:pmid\/15201187&rfr_id=info:sid\/en.wikipedia.org:Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NiemenmaaHadoop12-13\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-NiemenmaaHadoop12_13-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Niemenmaa, M.; Kallio, A.; Schumacher, A. et al. (2012). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3307120\" target=\"_blank\">\"Hadoop-BAM: Directly manipulating next generation sequencing data in the cloud\"<\/a>. <i>Bioinformatics<\/i> <b>28<\/b> (6): 876-7. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fbioinformatics%2Fbts054\" target=\"_blank\">10.1093\/bioinformatics\/bts054<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3307120\/\" target=\"_blank\">PMC3307120<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/22302568\" target=\"_blank\">22302568<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3307120\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3307120<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Hadoop-BAM%3A+Directly+manipulating+next+generation+sequencing+data+in+the+cloud&rft.jtitle=Bioinformatics&rft.aulast=Niemenmaa%2C+M.%3B+Kallio%2C+A.%3B+Schumacher%2C+A.+et+al.&rft.au=Niemenmaa%2C+M.%3B+Kallio%2C+A.%3B+Schumacher%2C+A.+et+al.&rft.date=2012&rft.volume=28&rft.issue=6&rft.pages=876-7&rft_id=info:doi\/10.1093%2Fbioinformatics%2Fbts054&rft_id=info:pmc\/PMC3307120&rft_id=info:pmid\/22302568&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3307120&rfr_id=info:sid\/en.wikipedia.org:Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ZhaoRainbow13-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ZhaoRainbow13_14-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Zhao, S.; Prenger, K.; Smith, L. et al. (2013). 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3698007\" target=\"_blank\">\"Rainbow: A tool for large-scale whole-genome sequencing data analysis using cloud computing\"<\/a>. <i>BMC Genomics<\/i> <b>14<\/b>: 425. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2F1471-2164-14-425\" target=\"_blank\">10.1186\/1471-2164-14-425<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3698007\/\" target=\"_blank\">PMC3698007<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23802613\" target=\"_blank\">23802613<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3698007\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3698007<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Rainbow%3A+A+tool+for+large-scale+whole-genome+sequencing+data+analysis+using+cloud+computing&rft.jtitle=BMC+Genomics&rft.aulast=Zhao%2C+S.%3B+Prenger%2C+K.%3B+Smith%2C+L.+et+al.&rft.au=Zhao%2C+S.%3B+Prenger%2C+K.%3B+Smith%2C+L.+et+al.&rft.date=2013&rft.volume=14&rft.pages=425&rft_id=info:doi\/10.1186%2F1471-2164-14-425&rft_id=info:pmc\/PMC3698007&rft_id=info:pmid\/23802613&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3698007&rfr_id=info:sid\/en.wikipedia.org:Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GurtowskiGeno12-15\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GurtowskiGeno12_15-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Gurtowski, J.; Schatz, M.C.; Langmead, B. (2012). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3465669\" target=\"_blank\">\"Genotyping in the cloud with Crossbow\"<\/a>. <i>Current Protocols in Bioinformatics<\/i> <b>39<\/b> (Unit 15.3): 15.3.1\u201315.3.15. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1002%2F0471250953.bi1503s39\" target=\"_blank\">10.1002\/0471250953.bi1503s39<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3465669\/\" target=\"_blank\">PMC3465669<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/22948728\" target=\"_blank\">22948728<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3465669\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3465669<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Genotyping+in+the+cloud+with+Crossbow&rft.jtitle=Current+Protocols+in+Bioinformatics&rft.aulast=Gurtowski%2C+J.%3B+Schatz%2C+M.C.%3B+Langmead%2C+B.&rft.au=Gurtowski%2C+J.%3B+Schatz%2C+M.C.%3B+Langmead%2C+B.&rft.date=2012&rft.volume=39&rft.issue=Unit+15.3&rft.pages=15.3.1%E2%80%9315.3.15&rft_id=info:doi\/10.1002%2F0471250953.bi1503s39&rft_id=info:pmc\/PMC3465669&rft_id=info:pmid\/22948728&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3465669&rfr_id=info:sid\/en.wikipedia.org:Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NagasakiDDBJ13-16\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-NagasakiDDBJ13_16-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Nagasaki, H.; Mochizuki, T.; Kodama, Y. et al. (2013). 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3738164\" target=\"_blank\">\"DDBJ read annotation pipeline: A cloud computing-based pipeline for high-throughput analysis of next-generation sequencing data\"<\/a>. <i>DNA Research<\/i> <b>20<\/b> (4): 383-90. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fdnares%2Fdst017\" target=\"_blank\">10.1093\/dnares\/dst017<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3738164\/\" target=\"_blank\">PMC3738164<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23657089\" target=\"_blank\">23657089<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3738164\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3738164<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=DDBJ+read+annotation+pipeline%3A+A+cloud+computing-based+pipeline+for+high-throughput+analysis+of+next-generation+sequencing+data&rft.jtitle=DNA+Research&rft.aulast=Nagasaki%2C+H.%3B+Mochizuki%2C+T.%3B+Kodama%2C+Y.+et+al.&rft.au=Nagasaki%2C+H.%3B+Mochizuki%2C+T.%3B+Kodama%2C+Y.+et+al.&rft.date=2013&rft.volume=20&rft.issue=4&rft.pages=383-90&rft_id=info:doi\/10.1093%2Fdnares%2Fdst017&rft_id=info:pmc\/PMC3738164&rft_id=info:pmid\/23657089&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3738164&rfr_id=info:sid\/en.wikipedia.org:Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KimTopHat13-17\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-KimTopHat13_17-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Kim, D.; Pertea, G.; Trapnell, C. et al. (2013). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4053844\" target=\"_blank\">\"TopHat2: Accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions\"<\/a>. <i>Genome Biology<\/i> <b>14<\/b> (4): R36. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2Fgb-2013-14-4-r36\" target=\"_blank\">10.1186\/gb-2013-14-4-r36<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4053844\/\" target=\"_blank\">PMC4053844<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23618408\" target=\"_blank\">23618408<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4053844\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4053844<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=TopHat2%3A+Accurate+alignment+of+transcriptomes+in+the+presence+of+insertions%2C+deletions+and+gene+fusions&rft.jtitle=Genome+Biology&rft.aulast=Kim%2C+D.%3B+Pertea%2C+G.%3B+Trapnell%2C+C.+et+al.&rft.au=Kim%2C+D.%3B+Pertea%2C+G.%3B+Trapnell%2C+C.+et+al.&rft.date=2013&rft.volume=14&rft.issue=4&rft.pages=R36&rft_id=info:doi\/10.1186%2Fgb-2013-14-4-r36&rft_id=info:pmc\/PMC4053844&rft_id=info:pmid\/23618408&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4053844&rfr_id=info:sid\/en.wikipedia.org:Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li 
id=\"cite_note-LawRNA16-18\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LawRNA16_18-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Law, C.W.; Alhamdoosh, M.; Su, S. et al. (2016). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4937821\" target=\"_blank\">\"RNA-seq analysis is easy as 1-2-3 with limma, Glimma and edgeR\"<\/a>. <i>F1000Research<\/i> <b>5<\/b>: 1408. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.12688%2Ff1000research.9005.2\" target=\"_blank\">10.12688\/f1000research.9005.2<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4937821\/\" target=\"_blank\">PMC4937821<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/27441086\" target=\"_blank\">27441086<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4937821\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4937821<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=RNA-seq+analysis+is+easy+as+1-2-3+with+limma%2C+Glimma+and+edgeR&rft.jtitle=F1000Research&rft.aulast=Law%2C+C.W.%3B+Alhamdoosh%2C+M.%3B+Su%2C+S.+et+al.&rft.au=Law%2C+C.W.%3B+Alhamdoosh%2C+M.%3B+Su%2C+S.+et+al.&rft.date=2016&rft.volume=5&rft.pages=1408&rft_id=info:doi\/10.12688%2Ff1000research.9005.2&rft_id=info:pmc\/PMC4937821&rft_id=info:pmid\/27441086&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4937821&rfr_id=info:sid\/en.wikipedia.org:Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to presentation. 
In some cases important information was missing from the references, and that information was added.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data\">https:\/\/www.limswiki.org\/index.php\/Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div 
class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","684d7a3a2f6583b431b16f4884c7c07d_images":["https:\/\/www.limswiki.org\/images\/8\/8a\/Fig1_Closha_BMCBioinfo2018_19-Sup1.gif","https:\/\/www.limswiki.org\/images\/a\/a7\/Fig2_Closha_BMCBioinfo2018_19-Sup1.gif","https:\/\/www.limswiki.org\/images\/9\/98\/Fig3_Closha_BMCBioinfo2018_19-Sup1.gif","https:\/\/www.limswiki.org\/images\/f\/fd\/Fig4_Closha_BMCBioinfo2018_19-Sup1.gif","https:\/\/www.limswiki.org\/images\/b\/b8\/Fig5_Closha_BMCBioinfo2018_19-Sup1.gif"],"684d7a3a2f6583b431b16f4884c7c07d_timestamp":1544813851,"a3349d5e1cf1d4519948fbbbfffe0deb_type":"article","a3349d5e1cf1d4519948fbbbfffe0deb_title":"Developing a bioinformatics program and supporting infrastructure in a biomedical library (Hosburgh 2018)","a3349d5e1cf1d4519948fbbbfffe0deb_url":"https:\/\/www.limswiki.org\/index.php\/Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library","a3349d5e1cf1d4519948fbbbfffe0deb_plaintext":"Journal:Developing a bioinformatics program and supporting infrastructure in a biomedical library\nFrom LIMSWiki\n\nFull article title: Developing a bioinformatics program and supporting infrastructure in a biomedical library\nJournal: Journal of eScience Librarianship\nAuthor(s): Hosburgh, Nathan\nAuthor affiliation(s): National Institutes of Health\nPrimary contact: Email: Nathan dot Hosburgh at nih dot gov\nYear published: 2018\nVolume and issue: 7(2)\nPage(s): e1129\nDOI: 10.7191\/jeslib.2018.1129\nISSN: 2161-3974\nDistribution license: Creative Commons Attribution 4.0 International\nWebsite: https:\/\/escholarship.umassmed.edu\/jeslib\/vol7\/iss2\/2\/\nDownload: https:\/\/escholarship.umassmed.edu\/cgi\/viewcontent.cgi?article=1129&context=jeslib (PDF)\n\nContents\n\n1 Abstract 
\n2 Introduction and background \n3 Study purpose \n4 Case presentation \n5 Discussion \n6 Acknowledgements \n7 Disclosure \n8 References \n9 Notes \n\n\n\nAbstract \nBackground: Over the last couple decades, the field of bioinformatics has helped spur medical discoveries that offer a better understanding of the genetic basis of disease, which in turn improve public health and save lives. Concomitantly, support requirements for molecular biology researchers have grown in scope and complexity, incorporating specialized resources, technologies, and techniques.\nCase presentation: To address this specific need among National Institutes of Health (NIH) intramural researchers, the NIH Library hired an expert bioinformatics trainer and consultant with a PhD in biochemistry to implement a bioinformatics support program. This study traces the program from its inception in 2009 to its present form. Discussion involves the particular skills of program staff, development of content, collection of resources, associated technology, assessment, and the impact of the program on the NIH community.\nConclusion: Based on quantitative and qualitative data, the bioinformatics support program has been heavily used and appreciated by researchers. Continued success will depend on filling key staff positions, building on the existing program infrastructure, and keeping abreast of developments within the field to remain relevant and in touch with the medical research community utilizing bioinformatics services. \nKeywords: bioinformatics, bioinformatics support program, biomedical library\n\nIntroduction and background \nIn the context of an ever-expanding information landscape, those involved in biomedical research have become increasingly reliant on the use of bioinformatics to analyze large amounts of complex data. Bioinformatics is an interdisciplinary field involving molecular biology and genetics, computer science, mathematics, and statistics. 
Large-scale biological problems, such as modeling biological processes, are addressed from a computational point of view so that inferences can be made from aggregate data.[1] As stated by Rein[2], \u201cBioinformatics research advances in such areas as gene therapy, personalized medicine, drug discovery, the inherited basis of complex diseases influenced by multiple gene\/ environmental interactions, and the identification of the molecular targets for environmental mutagens and carcinogens have wide ranging implications for the medical and consumer health sectors.\u201d[2] The field of bioinformatics has seen explosive growth since the mid-1990s, spurred by the Human Genome Project and rapid advances in DNA sequencing technology.\nDespite the importance of bioinformatics in advancing scientific research, it has been observed that most researchers in the life sciences do not have the necessary training to take advantage of the array of bioinformatics tools and resources available to them due to the rapidly evolving, interdisciplinary nature of the field.[3] Extensive technological changes, new databases and software, and changes in the types and quantity of data combine to pose formidable challenges to the uninitiated. Likewise, few biomedical librarians have the training, experience, or subject expertise required to provide robust bioinformatics services such as interpretation of molecular sequence database search results, pathway analysis, and data analysis from the latest biotechnology advances. 
Therefore, some institutions have recruited individuals with advanced degrees in biology or biochemistry and a strong background in bioinformatics to assess the molecular biological information needs of researchers and design strategies to enhance library resources and services in the areas of consultation, education, and resource development.[2][4][5]\nAs library involvement in bioinformatics has grown, particularly across research and clinical settings, the role of the health information professional as \u201cinformationist\u201d has become more prominent. Specifically, in the \u201cbioinformaticist\u201d role, the information professional possesses advanced subject knowledge in information science as well as applied technical and biological skills.[6][7] Those responsible for building library bioinformatics programs must discern user needs and skills, identify existing services, develop plans for new services, recruit and train specialized staff, establish collaborations with other centers at their institutions, and assess the success of such programs.[8][9] If executed effectively, library involvement in bioinformatics support services has the potential to contribute to the process of scientific discovery and save the research community valuable time and money.\n\nStudy purpose \nThe purpose of this case study is to outline the process of creating, developing, and assessing a bioinformatics support program at the National Institutes of Health in Bethesda, Maryland. \n\nCase presentation \nThe National Institutes of Health (NIH), a part of the U.S. Department of Health and Human Services, is the nation\u2019s medical research agency. Located in the Clinical Research Center at the heart of campus, the NIH Library supports the clinical care and research of the intramural community, which leads to discoveries that improve public health and save lives. 
In addition to bioinformatics, the NIH Library provides services in bibliometrics, custom information solutions, data management and analysis, document delivery, editing, literature searching, research assistance, systematic reviews, training, and translations.[10]\nIn 2008, the National Center for Biotechnology Information (NCBI) scaled back its bioinformatics training program, creating a need for other groups to offer the training previously provided by the NCBI. The NIH Library, in keeping with its objective to support intramural research in genetics and bioinformatics more comprehensively, stepped in to fill that void by offering training specifically geared towards NIH investigators. \nIn February 2009, the NIH Library hired an expert bioinformatics trainer and consultant, Dr. Medha Bhagwat, to support bioinformatics research at NIH. Up to this point, the library did not offer bioinformatics support services. Dr. Bhagwat arrived from NCBI with 11 years of bioinformatics experience as well as diverse expertise in biochemistry and structural biology.\nDuring her tenure at NCBI, Dr. Bhagwat developed and taught several two-hour mini-courses dealing with the effective use of specialized bioinformatics tools. These included \u201cquick start\u201d courses on analyzing microbial genomes, structural analysis, identification of disease genes, correlating disease genes and phenotypes, understanding DNA and protein sequences, and utilizing tools such as BLAST, Entrez Gene, MapViewer, and GenBank. Leveraging the courses and training she had previously developed at NCBI, Dr. Bhagwat was able to create classes tailored to the specific bioinformatics needs of the NIH intramural research community.[11] Previous work as a bench scientist endowed her with an understanding of the needs and terminology particular to biomedical researchers. The fact that Dr. 
Bhagwat had been employed on the NIH campus since 1994 meant that she had also generated a strong internal network and was able to feel the pulse of the research community. These qualities combined to immediately make Dr. Bhagwat a valuable resource in her new role at the NIH Library.\nAlthough Dr. Bhagwat had the expertise, experience, and training as a bioinformaticist, preliminary work was necessary to build a comprehensive bioinformatics support program. She began by researching bioinformatics support programs at prominent medical libraries and found that such programs include one or more of the following: instruction, licensing, computing software, collections, resource development such as online tutorials, and frameworks for collaborations among researchers. She then sought to identify the requirements of the NIH research community via a three-pronged approach: interviews with bioinformatics specialists at several NIH institutes, direct interaction with researchers during early training and consultation sessions, and a formal survey of NIH scientists. An initial bioinformatics support program was established, consisting of classroom training, one-on-one tutorials and consultation, online tutorials, software and database licenses, high-performance computers, and a collection of books, journals, and other literature.\nClassroom training is taught by NIH Library staff as well as outside speakers, including subject and product experts supplied by bioinformatics software vendors. Most of the classroom instruction is provided in the library training room with additional live streaming over WebEx in some cases. Dr. Bhagwat formed strategic partnerships with several institutes to teach on-site training programs offered to extramural scientists, medical professionals, educators, and students at other facilities. 
These partnerships have helped expand the reach of the NIH Library\u2019s bioinformatics support program and have fostered a network of bioinformatics experts across campus. Examples include the National Institute of Nursing Research (NINR) Precision Health Boot Camp[12] and the Summer Genetics Institute for nurses[13]; the National Human Genome Research Institute (NHGRI) Short Course in Genomics[14] for middle- and high-school teachers and for community-college and tribal-college faculty; and the National Library of Medicine\u2019s (NLM) remote hands-on classes hosted by university libraries for academic researchers.[15][16] Dr. Bhagwat taught a two-credit course, \u201cPractical Bioinformatics,\u201d at the Foundation for Advanced Education in the Sciences (FAES) at NIH annually during the fall semester[17], and she gave lectures at Georgetown University as adjunct faculty and provided continuing education courses at both the Medical Library Association[18] and Special Libraries Association conferences.[19] The annual NIH Library Bioinformatics Research Symposium is a good example of a collaborative endeavor: the Library organizes a two-day event featuring a series of scientific presentations highlighting practical applications of the analysis tools and databases licensed by the NIH Library for NIH researchers. The presenters are all scientists from NIH or relevant companies offering such bioinformatics tools.[20]\nExamples of bioinformatics classes led by Dr. 
Bhagwat at NIH include: Making Sense of DNA and Protein Sequences; Gene Resources: From Transcription Factor Binding Sites to Function; Sequence Similarity Search: BLAST-Like Alignment Tool (BLAT); Protein Structural Analysis: Binding Sites to Distant Homologs; Genome Browsers; Identification of Disease Genes; Correlation of Disease Genes to Phenotypes; Microbial Genome Analysis; Gene Expression Microarray Data Analysis; Next Gen Sequence Analysis; Gene Expression Omnibus; and Introduction to Clinical Genomics.\nIn addition, specific training is done by vendor-provided experts on the following proprietary bioinformatics software: CLC Biomedical Genomics Workbench, DNASTAR Lasergene, ArrayStar Qseq, SeqMan NGen, Metacore, MetaGeneMark, GeneIndexer, GeneSpring, Genomatix Genome Analyzer, Golden Helix SVS and VarSeq, Human Gene Mutation Database Professional, Ingenuity Pathways Analysis, Partek Genomics Suite, Pathway Studio, and ProteinLounge.\nDepending on the software, the library provides online access via floating licenses or directly on three specialized bioinformatics workstations, two of which have identical specifications for typical high-throughput analysis: Windows 7 64-bit, 8 cores, 48 GB RAM, and 2 TB disk space. The third workstation is designed specifically to run CLC Genomics Workbench, an application for analyzing and visualizing next-generation sequencing (NGS) data. The specifications of this computer are more robust due to the demanding requirements of this sort of data analysis: Red Hat Enterprise Linux 6 64-bit, 28 cores, 512 GB RAM, and 24 TB disk space. However, even with these computing capabilities, the workstations often run overnight in order to complete such analyses. \nIn order to bolster support for the burgeoning bioinformatics program, a second staff member was hired in August 2010. Dr. Lynn Young has a PhD in physics, with computer programming experience as well as expertise in microarray and next-generation sequencing data analysis. 
Employing years of teaching experience, Dr. Bhagwat provides classroom instruction and organizes vendor-led instruction, while Dr. Young devotes more time to individual and small group consultations, either on the bioinformatics workstations or in her office. Due to her background in computer science and bioinformatics, Young is uniquely positioned to collaborate with NIH researchers by assisting with using software, writing scripts, and interpreting the results of complex analyses. When a researcher needs a tutorial before Dr. Young is available, she is able to refer them to a short video tutorial outlining the analysis of next-generation sequencing data using specific software and follow up later with an in-person meeting. Examples of tutorial and consultation topics include: download upstream gene sequence and identify transcription factor binding sites; gene set enrichment\/pathway analysis from microarray experiments; and next-gen sequence analysis using RNA-Seq, ChIP-Seq, and miRNASeq.\nIn response to the heavy demands of instruction and consultation, the Bioinformatics Workgroup was formed to handle some of the administrative functions of the program. This workgroup consists of library staff members who are not bioinformaticists but support the program in various ways; these support roles were realized by reallocating resources among existing NIH Library staff. Support activities include communicating with vendors; scheduling and keeping an up-to-date training calendar; organizing qualitative and quantitative data from testimonials and evaluation forms; and compiling statistics on classes, tutorials, off-site presentations, workstation reservations, software usage, and other metrics that feed into assessment of the program.\nThe most comprehensive formal assessment covers the 2016 calendar year in which 50 training sessions were provided to a total of 1,475 participants. 
The Bioinformatics Workgroup adjusts strategies for advertising and works with the library's Communication Workgroup to make such training available to as many attendees as possible. For example, the group decided to raise the cap on registrants for each class and to publicize to people on the waiting list that, if they arrive early to class and sign in, they will be given any empty seat once the class begins.\nTable 1 shows a list of vendor-led training during 2016. This training for fee-based resources is typically provided as part of the library\u2019s subscription. It gives vendors an opportunity to promote their resources and enables the user community to gain targeted experience with specialized tools. \n\nTable 1. Vendor-led training\nClass\tAttendees\nGeneSpring 13.1\t16\nPathway Studio\t9\nGeneSpring 13.1\t16\nPartek Flow\t17\nPartek Genomics Suite\t14\nIngenuity Pathway Analysis\t27\nGenomatix\t19\nPathway Studio\t16\nGeneSpring 13.1\t16\nQIAGEN Ingenuity Variant Analysis\t12\nPartek Flow\t20\nQIAGEN CLC Genomics Workbench\t24\nPathway Studio\t15\nPartek Genomics\t14\nMetaCore\t12\nGeneSpring 14.5\t12\nGeneSpring 14.5\t15\n\nPartnerships have been formed with other NIH institutes to provide training in library facilities, while library program staff also provide training for them at their own centers (see Table 2). For example, the National Cancer Institute (NCI) and the National Institute of Allergy and Infectious Diseases (NIAID) offered an exome sequencing analysis class in the library during 2016.\n\nTable 2. Strategic partner-led training\nClass\tAttendees\nRNA-Seq\t34\nOmicCircos\t14\nChIP-Seq Analysis\t27\nExome Sequencing Analysis\t33\nPathway Analysis\t16\n\nTable 3 lists classes led by NIH Library bioinformatics staff during 2016.\n\nTable 3. Staff-led training\nClass\tLocation\tAttendees\tInstructor\nGenome Browsers\tNIA, Baltimore\t24\tBhagwat\nTCGA\tNIHL\t20\tBhagwat\nIntroduction to Clinical Genomics\tNIHL\t24\tBhagwat\nIntroduction to Clinical Genomics\tNLM and remote\t40\tBhagwat\nGene Expression Omnibus\tNIHL\t19\tBhagwat\nGene Resources\tGeorgetown University\t20\tBhagwat\nGenome Browsers\tNIHL\t14\tBhagwat\nPathway Analysis\tNICHD\t25\tBhagwat\nGene Resources\tNIHL\t27\tBhagwat\nSequence Analysis\tNIHL\t26\tBhagwat\nBioinformatics Introduction to SQL\tNIHL\t11\tYoung\nNINR SGI Program\tNIHL\t36\tBhagwat\nBioinformatics Symposium\tNIHL\t320\tBhagwat\nBLAST\tNIHL\t20\tBhagwat\nNINR Boot Camp\tNIHL\t170\tBhagwat\nNHGRI Bio 101\tNIHL\t25\tBhagwat\nBLAT\tNIHL\t13\tBhagwat\nMaking Sense\tNIHL\t20\tBhagwat\nGene Resources\tNIHL\t20\tBhagwat\nGenome Browser\tNIHL\t20\tBhagwat\nGEO\tNIHL\t20\tBhagwat\nClinical Genomics\tNIHL\t20\tBhagwat\nBLAST\tNIHL\t20\tBhagwat\nBLAT\tNIHL\t20\tBhagwat\nNext Gen\tNIHL\t20\tBhagwat\nTCGA\tNIHL\t20\tBhagwat\n\nIn order to use networked bioinformatics resources, NIH affiliates are required to register for access to a particular resource so that an individual account is created. The highest number of new registrations in 2016 was recorded for Ingenuity software; the National Cancer Institute had the most new registrants overall. A total of 524 reservations were made for the bioinformatics workstations. Workstation 2, the only workstation with CLC Genomics Workbench software used to align short sequence reads to a genome sequence (big data analysis), had the most reservations. Partek Genomics Suite was used the most on Workstation 1. 
Genomatix, Golden Helix SVS, and Pathway Studio represent the software reserved most frequently on Workstation 3. The National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) booked the most reservations of any institute in 2016.\nAlthough quantitative data is useful in evaluating the program, researchers have often indicated the value of instruction and consultation by providing qualitative feedback. This is most often received by bioinformatics staff via email and surveys. Below is one of many positive comments from a course participant:\n\nDear Medha,\n\nYear after year, we put together an outstanding selection of speakers for Short course. Year after year\u2014against that background of excellence\u2014your bioinformatics workshop literally blows up the brains of our teachers. One super experienced teacher in the area of bioinformatics summed it all for me by simply saying \u201cthe best bioinformatics workshop I ever attended.\u201d Thank you for your commitment to our educational projects. You are truly a cornerstone of our course.\n\nDiscussion \nThe NIH Library bioinformatics program has served more than 10,000 participants directly through classroom training and individual consultations since 2009. Drawing from the quantitative and qualitative data, it is clear that the NIH Library Bioinformatics Support Program is well-used and appreciated by researchers. However, in order to remain relevant, it is important to understand the evolving needs of the NIH research community. Based on the experience of bioinformatics staff, it is necessary to be in regular contact with NIH researchers as well as the larger bioinformatics and library communities. Focused conferences, seminars, and individual consultations with investigators offer excellent opportunities for keeping track of current trends. 
Within the NIH research community, consultations in particular afforded staff the best opportunity for understanding the topics, effective modes of training, and resources required by researchers for bioinformatics analysis. These interactions indicated that many users benefit from individualized training in ways that large group training and webinars cannot address. In this setting, bench scientists and clinicians are able to engage in substantive conversation with informationists, discuss ideas, and directly apply knowledge to real-world problems in real-time. Forming a network of bioinformatics experts throughout the NIH community has also been a key factor in the growth and success of the program.\nIn the coming years, as new biotechnologies emerge, staff must identify cutting edge trends and emerging needs and make modifications\u2014both qualitatively and quantitatively\u2014in certain aspects of the program. For example, more in-person classes may be needed to accommodate demand for this format as evidenced by feedback on evaluation forms. And more online offerings tailored towards the library community at large might be provided to reach a broader audience and enable learned application of general bioinformatics concepts using practical techniques. In regard to data infrastructure, storage, and analysis, staff will need to work closely with the NIH Library Information Architecture Branch to investigate the merits of cloud computing versus high-performance workstations and associated servers supported by NIH, although reliable network speed is a potential limiting factor for moving in this direction. Government security in a networked environment is also a perennial concern, and the library must find comprehensive solutions for data backup and storage.\nIn July 2017, Dr. Bhagwat retired from the NIH Library. She was instrumental in creating the bioinformatics support program in 2009 and has been a cornerstone since that time. 
It remains to be seen whether her role can be filled by someone with the necessary experience, enthusiasm, and vision, not only to keep the program running, but to foster innovation and build on past successes. As with the NIH, institutions that have recruited individuals with advanced degrees in the biosciences into such roles have been able to create and sustain successful bioinformatics support programs.[2][4][5] While it takes a leader to spearhead such an endeavor, a dedicated support team is necessary to handle some of the administrative aspects such as scheduling, promotion, and data collection. In this way, subject experts can devote more of their time to directly assisting researchers.\nThe NIH Library Bioinformatics Support Program has grown to encompass staff and vendor-led classes, in-person consultations, online tutorials, high-performance workstations, analysis tools and databases, and other curated bioinformatics resources.[20] As this program evolves, the NIH Library strives to provide a dynamic and valuable suite of bioinformatics services to NIH and the larger medical research community well into the future.\n\nAcknowledgements \nThe author would like to thank Dr. Medha Bhagwat for supplying much of the information regarding the details of the bioinformatics program, Dr. Lynn Young for leading the initiative to evaluate the NIH Library Bioinformatics Program, and Lisa Federer for proposing that such a case study be written.\n\nDisclosure \nThe author reports no conflict of interest. Products named are for informational purposes only. The NIH Library does not endorse specific software or databases. \n\nReferences \n\n\n\u2191 Can, T. (2014). \"Introduction to Bioinformatics\". In Yousef, M.; Allmer, J.. miRNomics: MicroRNA Biology and Computational Analysis. Methods in Molecular Biology. 1107. Humana Press. doi:10.1007\/978-1-62703-748-8_4. ISBN 9781627037488.   \n\n\u2191 2.0 2.1 2.2 2.3 Rein, D.C. (2006). 
\"Developing library bioinformatics services in context: The Purdue University Libraries bioinformationist program\". Journal of the Medical Library Association 94 (3): 314\u201320. PMC PMC1525331. PMID 16888666. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC1525331 .   \n\n\u2191 Schneider, M.V.; Watson, J.; Attwood, T. et al. (2010). \"Bioinformatics training: A review of challenges, actions and support requirements\". Briefings in Bioinformatics 11 (6): 544\u201351. doi:10.1093\/bib\/bbq021. PMID 20562256.   \n\n\u2191 4.0 4.1 Li, M.; Chen, Y.B.; Clintworth, W.A. (2013). \"Expanding roles in a library-based bioinformatics service program: A case study\". Journal of the Medical Library Association 101 (4): 303\u20139. doi:10.3163\/1536-5050.101.4.012. PMC PMC3794686. PMID 24163602. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3794686 .   \n\n\u2191 5.0 5.1 Yarfitz, S.; Ketchell, D.S. (2000). \"A library-based bioinformatics services program\". Bulletin of the Medical Library Association 88 (1): 36\u201348. PMC PMC35196. PMID 10658962. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC35196 .   \n\n\u2191 Davidoff, F.; Florance, V. (2000). \"The informationist: A new health profession?\". Annals of Internal Medicine 132 (12): 996\u20138. doi:10.7326\/0003-4819-132-12-200006200-00012. PMID 10858185.   \n\n\u2191 Helms, A.J.; Bradford, K.D.; Warren, N.J.; Schwartz, D.G. (2004). \"Bioinformatics opportunities for health sciences librarians and information professionals\". Journal of the Medical Library Association 92 (4): 489\u201393. PMC PMC521520. PMID 15494764. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC521520 .   \n\n\u2191 Helms, A.J.; Bradford, K.D.; Warren, N.J.; Schwartz, D.G. (2004). \"Bioinformatics opportunities for health sciences librarians and information professionals\". 
Journal of the Medical Library Association 92 (4): 489\u201393. PMC PMC521520. PMID 15494764. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC521520 .   \n\n\u2191 Lyon, J.A.; Tennant, M.R.; Messner, K.R.; Osterbur, D.L. (2006). \"Carving a niche: Establishing bioinformatics collaborations\". Journal of the Medical Library Association 94 (3): 330\u20135. PMC PMC1525329. PMID 16888668. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC1525329 .   \n\n\u2191 \"About Us\". NIH Library. National Institutes of Health. https:\/\/nihlibrary.nih.gov\/about-us . Retrieved 09 March 2018 .   \n\n\u2191 Bhagwat, M.; Wheeler, D.; Valjavec-Gratian, M. (2006). \"Mini Courses\". National Center for Biotechnology Information. https:\/\/www.ncbi.nlm.nih.gov\/Class\/minicourses\/ .   \n\n\u2191 \"NINR \"Precision Health: Smart Technologies, Smart Health\u201d Boot Camp\". National Institute of Nursing Research. National Institutes of Health. https:\/\/www.ninr.nih.gov\/training\/trainingopportunitiesintramural\/bootcamp . Retrieved 09 March 2018 .   \n\n\u2191 \"Summer Genetics Institute (SGI)\". National Institute of Nursing Research. National Institutes of Health. https:\/\/www.ninr.nih.gov\/training\/trainingopportunitiesintramural\/summergeneticsinstitute . Retrieved 09 March 2018 .   \n\n\u2191 \"National Human Genome Research Institute Short Course in Genomics\". National Human Genome Research Institute. National Institutes of Health. https:\/\/www.genome.gov\/10000217\/nhgri-short-course-in-genomics\/ . Retrieved 09 March 2018 .   \n\n\u2191 \"Bioinformatics: Clinical Genomics Subject of Mini Course\". CDU Newsletter. Charles R. Drew University of Medicine and Science. 01 April 2016. https:\/\/www.cdrewu.edu\/CDUNewsletters\/activenews_view.asp?articleID=719 .   \n\n\u2191 University of Maryland Health Sciences and Human Services Library (2013). \"PubChem Training from NLM\". Connective Issues 7 (4). 
http:\/\/www2.hshsl.umaryland.edu\/newsletter\/?p=1434 .   \n\n\u2191 \"2015\u20132016 Catalog of Courses and Student Handbook\" (PDF). Foundation for Advanced Education in the Sciences. 2015. p. 24. https:\/\/faes.org\/sites\/default\/files\/files\/FAES%20Catalog%202015-16%20FINAL.pdf .   \n\n\u2191 \"MLA '12 Preliminary Program\" (PDF). Medical Library Association. 2012. p. 15. http:\/\/www.mlanet.org\/d\/do\/1854 .   \n\n\u2191 Hooper-Lane, C. (2010). \"2010 Conference Program Preview\" (PDF). Biofeedback 35 (2): 2. http:\/\/dbiosla.org\/publications\/pubs\/biofeedback\/Spring2010.pdf .   \n\n\u2191 20.0 20.1 \"Bioinformatics Support Program\". NIH Library. National Institutes of Health. https:\/\/nihlibrary.nih.gov\/services\/bioinformatics-support . Retrieved 09 March 2018 .   \n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. 
The original article lists references in alphabetical order, by author; this version lists them in order of appearance, by design.\n\n\n\n\n\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\">https:\/\/www.limswiki.org\/index.php\/Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library<\/a>\n\t\t\t\t\tCategories: LIMSwiki journal articles (added in 2018)LIMSwiki journal articles (all)LIMSwiki journal articles on bioinformatics\n
This page was last modified on 27 March 2018, at 21:58.\n\t\t\t\t\t\t\t\t\tContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n\n","a3349d5e1cf1d4519948fbbbfffe0deb_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:Developing a bioinformatics program and supporting infrastructure in a biomedical library<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p><b>Background<\/b>: Over the last couple decades, the field of <a href=\"https:\/\/www.limswiki.org\/index.php\/Bioinformatics\" title=\"Bioinformatics\" target=\"_blank\" class=\"wiki-link\" data-key=\"8f506695fdbb26e3f314da308f8c053b\">bioinformatics<\/a> has helped spur medical discoveries that offer a better understanding of the genetic basis of disease, which in turn 
improve public health and save lives. Concomitantly, support requirements for molecular biology researchers have grown in scope and complexity, incorporating specialized resources, technologies, and techniques.\n<\/p><p><b>Case presentation<\/b>: To address this specific need among <a href=\"https:\/\/www.limswiki.org\/index.php\/National_Institutes_of_Health\" title=\"National Institutes of Health\" target=\"_blank\" class=\"wiki-link\" data-key=\"e5c215c48e73ae58b0695dc2af951cd0\">National Institutes of Health<\/a> (NIH) intramural researchers, the NIH Library hired an expert bioinformatics trainer and consultant with a PhD in biochemistry to implement a bioinformatics support program. This study traces the program from its inception in 2009 to its present form. Discussion involves the particular skills of program staff, development of content, collection of resources, associated technology, assessment, and the impact of the program on the NIH community.\n<\/p><p><b>Conclusion<\/b>: Based on quantitative and qualitative data, the bioinformatics support program has been heavily used and appreciated by researchers. Continued success will depend on filling key staff positions, building on the existing program infrastructure, and keeping abreast of developments within the field to remain relevant and in touch with the medical research community utilizing bioinformatics services. 
\n<\/p><p><b>Keywords<\/b>: bioinformatics, bioinformatics support program, biomedical library\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Introduction_and_background\">Introduction and background<\/span><\/h2>\n<p>In the context of an ever-expanding <a href=\"https:\/\/www.limswiki.org\/index.php\/Information\" title=\"Information\" target=\"_blank\" class=\"wiki-link\" data-key=\"6300a14d9c2776dcca0999b5ed940e7d\">information<\/a> landscape, those involved in <a href=\"https:\/\/www.limswiki.org\/index.php\/Medical_research\" title=\"Medical research\" target=\"_blank\" class=\"wiki-link\" data-key=\"0ee7e4e2a32a422d78fe6bd1ab0d1cbc\">biomedical research<\/a> have become increasingly reliant on the use of bioinformatics to analyze large amounts of complex data. Bioinformatics is an interdisciplinary field involving molecular biology and genetics, computer science, mathematics, and statistics. Large-scale biological problems, such as modeling biological processes, are addressed from a computational point of view so that inferences can be made from aggregate data.<sup id=\"rdp-ebb-cite_ref-CanIntro14_1-0\" class=\"reference\"><a href=\"#cite_note-CanIntro14-1\" rel=\"external_link\">[1]<\/a><\/sup> As stated by Rein<sup id=\"rdp-ebb-cite_ref-ReinDevelop06_2-0\" class=\"reference\"><a href=\"#cite_note-ReinDevelop06-2\" rel=\"external_link\">[2]<\/a><\/sup>, \u201cBioinformatics research advances in such areas as gene therapy, personalized medicine, drug discovery, the inherited basis of complex diseases influenced by multiple gene\/ environmental interactions, and the identification of the molecular targets for environmental mutagens and carcinogens have wide ranging implications for the medical and consumer health sectors.\u201d<sup id=\"rdp-ebb-cite_ref-ReinDevelop06_2-1\" class=\"reference\"><a href=\"#cite_note-ReinDevelop06-2\" rel=\"external_link\">[2]<\/a><\/sup> The field of bioinformatics has seen explosive growth since the mid-1990s, spurred by the Human 
Genome Project and rapid advances in DNA sequencing technology.\n<\/p><p>Despite the importance of bioinformatics in advancing scientific research, it has been observed that most researchers in the life sciences do not have the necessary training to take advantage of the array of bioinformatics tools and resources available to them due to the rapidly evolving, interdisciplinary nature of the field.<sup id=\"rdp-ebb-cite_ref-SchneiderBioinfo10_3-0\" class=\"reference\"><a href=\"#cite_note-SchneiderBioinfo10-3\" rel=\"external_link\">[3]<\/a><\/sup> Extensive technological changes, new databases and software, and changes in the types and quantity of data combine to pose formidable challenges to the uninitiated. Likewise, few biomedical librarians have the training, experience, or subject expertise required to provide robust bioinformatics services such as interpretation of molecular sequence database search results, pathway analysis, and <a href=\"https:\/\/www.limswiki.org\/index.php\/Data_analysis\" title=\"Data analysis\" target=\"_blank\" class=\"wiki-link\" data-key=\"545c95e40ca67c9e63cd0a16042a5bd1\">data analysis<\/a> from the latest biotechnology advances. 
Therefore, some institutions have recruited individuals with advanced degrees in biology or biochemistry and a strong background in bioinformatics to assess the molecular biological information needs of researchers and design strategies to enhance library resources and services in the areas of consultation, education, and resource development.<sup id=\"rdp-ebb-cite_ref-ReinDevelop06_2-2\" class=\"reference\"><a href=\"#cite_note-ReinDevelop06-2\" rel=\"external_link\">[2]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-LiExpand13_4-0\" class=\"reference\"><a href=\"#cite_note-LiExpand13-4\" rel=\"external_link\">[4]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-YarfitzALib00_5-0\" class=\"reference\"><a href=\"#cite_note-YarfitzALib00-5\" rel=\"external_link\">[5]<\/a><\/sup>\n<\/p><p>As library involvement in bioinformatics has grown, particularly across research and clinical settings, the role of the health information professional as \u201cinformationist\u201d has become more prominent. Specifically, in the \u201cbioinformaticist\u201d role, the information professional possesses advanced subject knowledge in information science as well as applied technical and biological skills.<sup id=\"rdp-ebb-cite_ref-DavidoffTheInfo00_6-0\" class=\"reference\"><a href=\"#cite_note-DavidoffTheInfo00-6\" rel=\"external_link\">[6]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-HelmsBioinfo04_7-0\" class=\"reference\"><a href=\"#cite_note-HelmsBioinfo04-7\" rel=\"external_link\">[7]<\/a><\/sup> Those responsible for building library bioinformatics programs must discern user needs and skills, identify existing services, develop plans for new services, recruit and train specialized staff, establish collaborations with other centers at their institutions, and assess the success of such programs.<sup id=\"rdp-ebb-cite_ref-GeerBroad06_8-0\" class=\"reference\"><a href=\"#cite_note-GeerBroad06-8\" rel=\"external_link\">[8]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-LyonCarving06_9-0\" class=\"reference\"><a 
href=\"#cite_note-LyonCarving06-9\" rel=\"external_link\">[9]<\/a><\/sup> If executed effectively, library involvement in bioinformatics support services has the potential to contribute to the process of scientific discovery and save the research community valuable time and money.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Study_purpose\">Study purpose<\/span><\/h2>\n<p>The purpose of this case study is to outline the process of creating, developing, and assessing a bioinformatics support program at the National Institutes of Health in Bethesda, Maryland. \n<\/p>\n<h2><span class=\"mw-headline\" id=\"Case_presentation\">Case presentation<\/span><\/h2>\n<p>The National Institutes of Health (NIH), a part of the <a href=\"https:\/\/www.limswiki.org\/index.php\/United_States_Department_of_Health_and_Human_Services\" title=\"United States Department of Health and Human Services\" target=\"_blank\" class=\"wiki-link\" data-key=\"efa106bcbb93039b1a6c3c596daedec3\">U.S. Department of Health and Human Services<\/a>, is the nation\u2019s medical research agency. Located in the Clinical Research Center at the heart of campus, the NIH Library supports the clinical care and research of the intramural community, which leads to discoveries that improve public health and save lives. In addition to bioinformatics, the NIH Library provides services in bibliometrics, custom information solutions, data management and analysis, document delivery, editing, literature searching, research assistance, systematic reviews, training, and translations.<sup id=\"rdp-ebb-cite_ref-NIHAboutUs_10-0\" class=\"reference\"><a href=\"#cite_note-NIHAboutUs-10\" rel=\"external_link\">[10]<\/a><\/sup>\n<\/p><p>In 2008, the National Center for Biotechnology Information (NCBI) scaled back its bioinformatics training program, creating a need for other groups to offer the training previously provided by the NCBI. 
The NIH Library, in keeping with its objective to support intramural research in genetics and bioinformatics more comprehensively, stepped in to fill that void by offering training specifically geared towards NIH investigators. \n<\/p><p>In February 2009, the NIH Library hired an expert bioinformatics trainer and consultant, Dr. Medha Bhagwat, to support bioinformatics research at NIH. Up to this point, the library did not offer bioinformatics support services. Dr. Bhagwat arrived from NCBI with 11 years of bioinformatics experience as well as diverse expertise in biochemistry and structural biology.\n<\/p><p>During her tenure at NCBI, Dr. Bhagwat developed and taught several two-hour mini-courses dealing with the effective use of specialized bioinformatics tools. These included \u201cquick start\u201d courses on analyzing microbial genomes, structural analysis, identification of disease genes, correlating disease genes and phenotypes, understanding DNA and protein sequences, and utilizing tools such as BLAST, Entrez Gene, MapViewer, and GenBank. Leveraging the courses and training she had previously developed at NCBI, Dr. Bhagwat was able to create classes tailored to the specific bioinformatics needs of the NIH intramural research community.<sup id=\"rdp-ebb-cite_ref-BhagwatMini06_11-0\" class=\"reference\"><a href=\"#cite_note-BhagwatMini06-11\" rel=\"external_link\">[11]<\/a><\/sup> Previous work as a bench scientist endowed her with an understanding of the needs and terminology particular to biomedical researchers. The fact that Dr. Bhagwat had been employed on the NIH campus since 1994 meant that she had also generated a strong internal network and was able to feel the pulse of the research community. These qualities combined to immediately make Dr. Bhagwat a valuable resource in her new role at the NIH Library.\n<\/p><p>Although Dr. 
Bhagwat had the expertise, experience, and training as a bioinformaticist, preliminary work was necessary to build a comprehensive bioinformatics support program. She began by researching bioinformatics support programs at prominent medical libraries and found that such programs include one or more of the following: instruction, licensing, computing software, collections, resource development such as online tutorials, and frameworks for collaborations among researchers. She then sought to identify the requirements of the NIH research community via a three-pronged approach: interviews with bioinformatics specialists at several NIH institutes, direct interaction with researchers during early training and consultation sessions, and a formal survey of NIH scientists. An initial bioinformatics support program was established, consisting of classroom training, one-on-one tutorials and consultation, online tutorials, software and database licenses, high-performance computers, and a collection of books, journals, and other literature.\n<\/p><p>Classroom training is taught by NIH Library staff as well as outside speakers, including subject and product experts supplied by bioinformatics software vendors. Most of the classroom instruction is provided in the library training room with additional live streaming over WebEx in some cases. Dr. Bhagwat formed strategic partnerships with several institutes to teach on-site training programs offered to extramural scientists, medical professionals, educators, and students at other facilities. These partnerships have helped expand the reach of the NIH Library\u2019s bioinformatics support program and have fostered a network of bioinformatics experts across campus. 
Examples include the National Institute of Nursing Research (NINR) Precision Health Boot Camp<sup id=\"rdp-ebb-cite_ref-NIHPrecision18_12-0\" class=\"reference\"><a href=\"#cite_note-NIHPrecision18-12\" rel=\"external_link\">[12]<\/a><\/sup> and the Summer Genetics Institute for nurses<sup id=\"rdp-ebb-cite_ref-NIHSGI18_13-0\" class=\"reference\"><a href=\"#cite_note-NIHSGI18-13\" rel=\"external_link\">[13]<\/a><\/sup>; the National Human Genome Research Institute (NHGRI) Short Course in Genomics<sup id=\"rdp-ebb-cite_ref-NIHNationalHuman18_14-0\" class=\"reference\"><a href=\"#cite_note-NIHNationalHuman18-14\" rel=\"external_link\">[14]<\/a><\/sup> for middle- and high-school teachers, community college, and tribal-college faculty; and the National Library of Medicine\u2019s (NLM) remote hands-on classes hosted by university libraries for academic researchers.<sup id=\"rdp-ebb-cite_ref-CDUBioinfo16_15-0\" class=\"reference\"><a href=\"#cite_note-CDUBioinfo16-15\" rel=\"external_link\">[15]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-UoMConnective13_16-0\" class=\"reference\"><a href=\"#cite_note-UoMConnective13-16\" rel=\"external_link\">[16]<\/a><\/sup> Dr. 
Bhagwat taught the two-credit course \u201cPractical Bioinformatics\u201d at the Foundation for Advanced Education in the Sciences (FAES) at NIH annually during the fall semester<sup id=\"rdp-ebb-cite_ref-FAES1516_17-0\" class=\"reference\"><a href=\"#cite_note-FAES1516-17\" rel=\"external_link\">[17]<\/a><\/sup>, gave lectures at Georgetown University as adjunct faculty, and provided continuing education courses at both the Medical Library Association<sup id=\"rdp-ebb-cite_ref-MLA12Prelim12_18-0\" class=\"reference\"><a href=\"#cite_note-MLA12Prelim12-18\" rel=\"external_link\">[18]<\/a><\/sup> and Special Libraries Association conferences.<sup id=\"rdp-ebb-cite_ref-SLABiofeed10_19-0\" class=\"reference\"><a href=\"#cite_note-SLABiofeed10-19\" rel=\"external_link\">[19]<\/a><\/sup> The annual NIH Library Bioinformatics Research Symposium is an example of a collaborative endeavor: the library organizes a two-day event featuring a series of scientific presentations that highlight practical applications of the analysis tools and databases licensed by the NIH Library for NIH researchers. The presenters are all scientists from NIH or from companies offering such bioinformatics tools.<sup id=\"rdp-ebb-cite_ref-NIHLibBioinfo_20-0\" class=\"reference\"><a href=\"#cite_note-NIHLibBioinfo-20\" rel=\"external_link\">[20]<\/a><\/sup>\n<\/p><p>Examples of bioinformatics classes led by Dr. 
Bhagwat at NIH include: Making Sense of DNA and Protein Sequences; Gene Resources: From Transcription Factor Binding Sites to Function; Sequence Similarity Search: BLAST-Like Alignment Tool (BLAT); Protein Structural Analysis: Binding Sites to Distant Homologs; Genome Browsers; Identification of Disease Genes; Correlation of Disease Genes to Phenotypes; Microbial Genome Analysis; Gene Expression Microarray Data Analysis; Next Gen Sequence Analysis; Gene Expression Omnibus; and Introduction to Clinical Genomics.\n<\/p><p>In addition, vendor-provided experts offer specific training on the following proprietary bioinformatics software: CLC Biomedical Genomics Workbench, DNASTAR Lasergene, ArrayStar Qseq, SeqMan NGen, MetaCore, MetaGeneMark, GeneIndexer, GeneSpring, Genomatix Genome Analyzer, Golden Helix SVS and VarSeq, Human Gene Mutation Database Professional, Ingenuity Pathway Analysis, Partek Genomics Suite, Pathway Studio, and ProteinLounge.\n<\/p><p>Depending on the software, the library provides access either online via floating licenses or directly on three specialized bioinformatics workstations. Two of the workstations have identical specifications for typical high-throughput analysis: Windows 7 64-bit, 8 cores, 48 GB RAM, and 2 TB disk space. The third workstation is designed specifically to run CLC Genomics Workbench, an application for analyzing and visualizing <a href=\"https:\/\/www.limswiki.org\/index.php\/DNA_sequencing\" title=\"DNA sequencing\" target=\"_blank\" class=\"wiki-link\" data-key=\"7ff86b38049c37e30858efd13bd00925\">next-generation sequencing<\/a> (NGS) data. The specifications of this computer are more robust due to the demanding requirements of this sort of data analysis: Red Hat Enterprise Linux 6 64-bit, 28 cores, 512 GB RAM, and 24 TB disk space. However, even with these computing capabilities, the workstations often run overnight in order to complete such analyses. 
\n<\/p><p>To bolster support for the burgeoning bioinformatics program, a second staff member, Dr. Lynn Young, was hired in August 2010. Dr. Young has a PhD in physics, with computer programming experience as well as expertise in microarray and next-generation sequencing data analysis. Drawing on years of teaching experience, Dr. Bhagwat provides classroom instruction and organizes vendor-led instruction, while Dr. Young devotes more time to individual and small group consultations, either on the bioinformatics workstations or in her office. Due to her background in computer science and bioinformatics, Dr. Young is uniquely positioned to collaborate with NIH researchers by helping them use software, write scripts, and interpret the results of complex analyses. When a researcher needs a tutorial before Dr. Young is available, she can refer them to a short video tutorial outlining the analysis of next-generation sequencing data using specific software and follow up later with an in-person meeting. Examples of tutorial and consultation topics include: downloading upstream gene sequences and identifying transcription factor binding sites; gene set enrichment\/pathway analysis from microarray experiments; and next-gen sequence analysis using RNA-Seq, ChIP-Seq, and miRNA-Seq.\n<\/p><p>In response to the heavy demands of instruction and consultation, the Bioinformatics Workgroup was formed to handle some of the administrative functions of the program. This workgroup consists of library staff members who are not bioinformaticists but support the program in various ways; these support roles were realized by reallocating resources among existing NIH Library staff. 
Support activities include communicating with vendors; scheduling training and keeping the training calendar up to date; organizing qualitative and quantitative data from testimonials and evaluation forms; and compiling statistics on classes, tutorials, off-site presentations, workstation reservations, software usage, and other metrics that feed into assessment of the program.\n<\/p><p>The most comprehensive formal assessment covers the 2016 calendar year, in which 50 training sessions were provided to a total of 1,475 participants. The Bioinformatics Workgroup adjusts strategies for advertising and works with the library's Communication Workgroup to make such training available to as many attendees as possible. For example, the group decided to raise the cap on registrants for each class and to tell people on the waiting list that, if they arrive early to class and sign in, they will be given any empty seat once the class begins.\n<\/p><p>Table 1 lists vendor-led training during 2016. This training for fee-based resources is typically provided as part of the library\u2019s subscription. It gives vendors an opportunity to promote their resources and enables the user community to gain targeted experience with specialized tools. 
\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"2\"><b>Table 1.<\/b> Vendor-led training\n<\/td><\/tr>\n<tr>\n<th style=\"padding-left:10px; padding-right:10px;\">Class\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Attendees\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">GeneSpring 13.1\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">16\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Pathway Studio\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">9\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">GeneSpring 13.1\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">16\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Partek Flow\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">17\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Partek Genomics Suite\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">14\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Ingenuity Pathway Analysis\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">27\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Genomatix\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">19\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Pathway Studio\n<\/td>\n<td 
style=\"background-color:white; padding-left:10px; padding-right:10px;\">16\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">GeneSpring 13.1\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">16\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">QIAGEN Ingenuity Variant Analysis\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">12\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Partek Flow\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">20\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">QIAGEN CLC Genomics Workbench\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">24\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Pathway Studio\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">15\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Partek Genomics\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">14\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">MetaCore\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">12\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">GeneSpring 14.5\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">12\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">GeneSpring 14.5\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">15\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>Partnerships have been 
formed with other NIH institutes to provide training in library facilities, while library program staff also provide training for them at their own centers (see Table 2). For example, the National Cancer Institute (NCI) and the National Institute of Allergy and Infectious Diseases (NIAID) offered an exome sequencing analysis class in the library during 2016.\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"2\"><b>Table 2.<\/b> Strategic partner-led training\n<\/td><\/tr>\n<tr>\n<th style=\"padding-left:10px; padding-right:10px;\">Class\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Attendees\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">RNA-Seq\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">34\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">OmicCircos\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">14\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">ChIP-Seq Analysis\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">27\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Exome Sequencing Analysis\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">33\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Pathway Analysis\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">16\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>Table 3 lists classes led by NIH Library bioinformatics staff during 2016.\n<\/p>\n<table 
style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"4\"><b>Table 3.<\/b> Staff-led training\n<\/td><\/tr>\n<tr>\n<th style=\"padding-left:10px; padding-right:10px;\">Class\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Location\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Attendees\n<\/th>\n<th style=\"padding-left:10px; padding-right:10px;\">Instructor\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Genome Browsers\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIA, Baltimore\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">24\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">TCGA\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">20\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Introduction to Clinical Genomics\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">24\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Introduction to Clinical Genomics\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NLM and 
remote\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">40\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Gene Expression Omnibus\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">19\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Gene Resources\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Georgetown University\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">20\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Genome Browsers\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">14\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Pathway Analysis\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NICHD\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">25\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Gene Resources\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td 
style=\"background-color:white; padding-left:10px; padding-right:10px;\">27\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Sequence Analysis\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">26\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bioinformatics Introduction to SQL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">11\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Young\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NINR SGI Program\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">36\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bioinformatics Symposium\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">320\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">BLAST\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td style=\"background-color:white; 
padding-left:10px; padding-right:10px;\">20\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NINR Boot Camp\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">170\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NHGRI Bio 101\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">25\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">BLAT\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">13\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Making Sense\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">20\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Gene Resources\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">20\n<\/td>\n<td 
style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Genome Browser\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">20\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">GEO\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">20\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Clinical Genomics\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">20\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">BLAST\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">20\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">BLAT\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">20\n<\/td>\n<td style=\"background-color:white; padding-left:10px; 
padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Next Gen\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">20\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">TCGA\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">NIHL\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">20\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Bhagwat\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>To use networked bioinformatics resources, NIH affiliates must register for access to each resource so that an individual account is created. The highest number of new registrations in 2016 was recorded for Ingenuity software; the National Cancer Institute had the most new registrants overall. A total of 524 reservations were made for the bioinformatics workstations. Workstation 2, the only workstation with CLC Genomics Workbench software, which is used to align short sequence reads to a genome sequence (big data analysis), had the most reservations. Partek Genomics Suite was used the most on Workstation 1. Genomatix, Golden Helix SVS, and Pathway Studio were the software packages reserved most frequently on Workstation 3. The National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) booked the most reservations of any institute in 2016.\n<\/p><p>Although quantitative data is useful in evaluating the program, researchers have often indicated the value of instruction and consultation by providing qualitative feedback. This feedback is most often received by bioinformatics staff via email and surveys. 
Below is one of many positive comments from a course participant:\n<\/p>\n<blockquote>Dear Medha,<br \/>\nYear after year, we put together an outstanding selection of speakers for Short course. Year after year\u2014against that background of excellence\u2014your bioinformatics workshop literally blows up the brains of our teachers. One super experienced teacher in the area of bioinformatics summed it all for me by simply saying \u201cthe best bioinformatics workshop I ever attended.\u201d Thank you for your commitment to our educational projects. You are truly a cornerstone of our course.<\/blockquote>\n<h2><span class=\"mw-headline\" id=\"Discussion\">Discussion<\/span><\/h2>\n<p>The NIH Library bioinformatics program has served more than 10,000 participants directly through classroom training and individual consultations since 2009. The quantitative and qualitative data make clear that the NIH Library Bioinformatics Support Program is well used and appreciated by researchers. To remain relevant, however, the program must understand the evolving needs of the NIH research community. The experience of bioinformatics staff suggests that this requires regular contact with NIH researchers as well as the larger bioinformatics and library communities. Focused conferences, seminars, and individual consultations with investigators offer excellent opportunities for keeping track of current trends. Within the NIH research community, consultations in particular afforded staff the best opportunity for understanding the topics, effective modes of training, and resources required by researchers for bioinformatics analysis. These interactions indicated that many users benefit from individualized training in ways that large group training and webinars cannot address. 
In this setting, bench scientists and clinicians are able to engage in substantive conversation with informationists, discuss ideas, and directly apply knowledge to real-world problems in real time. Forming a network of bioinformatics experts throughout the NIH community has also been a key factor in the growth and success of the program.\n<\/p><p>In the coming years, as new biotechnologies emerge, staff must identify cutting-edge trends and emerging needs and modify certain aspects of the program, both qualitatively and quantitatively. For example, more in-person classes may be needed to accommodate demand for this format, as evidenced by feedback on evaluation forms. More online offerings tailored to the library community at large might also be provided to reach a broader audience and help participants apply general bioinformatics concepts using practical techniques. In regard to data infrastructure, storage, and analysis, staff will need to work closely with the NIH Library Information Architecture Branch to investigate the merits of <a href=\"https:\/\/www.limswiki.org\/index.php\/Cloud_computing\" title=\"Cloud computing\" target=\"_blank\" class=\"wiki-link\" data-key=\"fcfe5882eaa018d920cedb88398b604f\">cloud computing<\/a> versus high-performance workstations and associated servers supported by NIH, although reliable network speed is a potential limiting factor for moving in this direction. Government security in a networked environment is also a perennial concern, and the library must find comprehensive solutions for data backup and storage.\n<\/p><p>In July 2017, Dr. Bhagwat retired from the NIH Library. She was instrumental in creating the bioinformatics support program in 2009 and had been a cornerstone of it ever since. 
It remains to be seen whether her role can be filled by someone with the necessary experience, enthusiasm, and vision, not only to keep the program running, but also to foster innovation and build on past successes. Like the NIH, institutions that have recruited individuals with advanced degrees in the biosciences into such roles have been able to create and sustain successful bioinformatics support programs.<sup id=\"rdp-ebb-cite_ref-ReinDevelop06_2-3\" class=\"reference\"><a href=\"#cite_note-ReinDevelop06-2\" rel=\"external_link\">[2]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-LiExpand13_4-1\" class=\"reference\"><a href=\"#cite_note-LiExpand13-4\" rel=\"external_link\">[4]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-YarfitzALib00_5-1\" class=\"reference\"><a href=\"#cite_note-YarfitzALib00-5\" rel=\"external_link\">[5]<\/a><\/sup> While it takes a leader to spearhead such an endeavor, a dedicated support team is necessary to handle some of the administrative aspects such as scheduling, promotion, and data collection. In this way, subject experts can devote more of their time to directly assisting researchers.\n<\/p><p>The NIH Library Bioinformatics Support Program has grown to encompass staff- and vendor-led classes, in-person consultations, online tutorials, high-performance workstations, analysis tools and databases, and other curated bioinformatics resources.<sup id=\"rdp-ebb-cite_ref-NIHLibBioinfo_20-1\" class=\"reference\"><a href=\"#cite_note-NIHLibBioinfo-20\" rel=\"external_link\">[20]<\/a><\/sup> As this program evolves, the NIH Library strives to provide a dynamic and valuable suite of bioinformatics services to NIH and the larger medical research community well into the future.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Acknowledgements\">Acknowledgements<\/span><\/h2>\n<p>The author would like to thank Dr. Medha Bhagwat for supplying much of the information regarding the details of the bioinformatics program, Dr. 
Lynn Young for leading the initiative to evaluate the NIH Library Bioinformatics Program, and Lisa Federer for proposing that such a case study be written.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Disclosure\">Disclosure<\/span><\/h2>\n<p>The author reports no conflict of interest. Products named are for informational purposes only. The NIH Library does not endorse specific software or databases. \n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-CanIntro14-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CanIntro14_1-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Can, T. (2014). \"Introduction to Bioinformatics\". In Yousef, M.; Allmer, J.. <i>miRNomics: MicroRNA Biology and Computational Analysis<\/i>. Methods in Molecular Biology. <b>1107<\/b>. Humana Press. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2F978-1-62703-748-8_4\" target=\"_blank\">10.1007\/978-1-62703-748-8_4<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9781627037488.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Introduction+to+Bioinformatics&rft.atitle=miRNomics%3A+MicroRNA+Biology+and+Computational+Analysis&rft.aulast=Can%2C+T.&rft.au=Can%2C+T.&rft.date=2014&rft.series=Methods+in+Molecular+Biology&rft.volume=1107&rft.pub=Humana+Press&rft_id=info:doi\/10.1007%2F978-1-62703-748-8_4&rft.isbn=9781627037488&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ReinDevelop06-2\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-ReinDevelop06_2-0\" rel=\"external_link\">2.0<\/a><\/sup> <sup><a href=\"#cite_ref-ReinDevelop06_2-1\" rel=\"external_link\">2.1<\/a><\/sup> <sup><a href=\"#cite_ref-ReinDevelop06_2-2\" rel=\"external_link\">2.2<\/a><\/sup> <sup><a href=\"#cite_ref-ReinDevelop06_2-3\" rel=\"external_link\">2.3<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Rein, D.C. (2006). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC1525331\" target=\"_blank\">\"Developing library bioinformatics services in context: The Purdue University Libraries bioinformationist program\"<\/a>. <i>Journal of the Medical Library Association<\/i> <b>94<\/b> (3): 314\u201320. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC1525331\/\" target=\"_blank\">PMC1525331<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/16888666\" target=\"_blank\">16888666<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC1525331\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC1525331<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Developing+library+bioinformatics+services+in+context%3A+The+Purdue+University+Libraries+bioinformationist+program&rft.jtitle=Journal+of+the+Medical+Library+Association&rft.aulast=Rein%2C+D.C.&rft.au=Rein%2C+D.C.&rft.date=2006&rft.volume=94&rft.issue=3&rft.pages=314%E2%80%9320&rft_id=info:pmc\/PMC1525331&rft_id=info:pmid\/16888666&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC1525331&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SchneiderBioinfo10-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SchneiderBioinfo10_3-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Schneider, M.V.; Watson, J.; Attwood, T. et al. (2010). \"Bioinformatics training: A review of challenges, actions and support requirements\". <i>Briefings in Bioinformatics<\/i> <b>11<\/b> (6): 544\u201351. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1093%2Fbib%2Fbbq021\" target=\"_blank\">10.1093\/bib\/bbq021<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/20562256\" target=\"_blank\">20562256<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Bioinformatics+training%3A+A+review+of+challenges%2C+actions+and+support+requirements&rft.jtitle=Briefings+in+Bioinformatics&rft.aulast=Schneider%2C+M.V.%3B+Watson%2C+J.%3B+Attwood%2C+T.+et+al.&rft.au=Schneider%2C+M.V.%3B+Watson%2C+J.%3B+Attwood%2C+T.+et+al.&rft.date=2010&rft.volume=11&rft.issue=6&rft.pages=544%E2%80%9351&rft_id=info:doi\/10.1093%2Fbib%2Fbbq021&rft_id=info:pmid\/20562256&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LiExpand13-4\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-LiExpand13_4-0\" rel=\"external_link\">4.0<\/a><\/sup> <sup><a href=\"#cite_ref-LiExpand13_4-1\" rel=\"external_link\">4.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Li, M.; Chen, Y.B.; Clintworth, W.A. (2013). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3794686\" target=\"_blank\">\"Expanding roles in a library-based bioinformatics service program: A case study\"<\/a>. <i>Journal of the Medical Library Association<\/i> <b>101<\/b> (4): 303\u20139. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3163%2F1536-5050.101.4.012\" target=\"_blank\">10.3163\/1536-5050.101.4.012<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3794686\/\" target=\"_blank\">PMC3794686<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/24163602\" target=\"_blank\">24163602<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3794686\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3794686<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Expanding+roles+in+a+library-based+bioinformatics+service+program%3A+A+case+study&rft.jtitle=Journal+of+the+Medical+Library+Association&rft.aulast=Li%2C+M.%3B+Chen%2C+Y.B.%3B+Clintworth%2C+W.A.&rft.au=Li%2C+M.%3B+Chen%2C+Y.B.%3B+Clintworth%2C+W.A.&rft.date=2013&rft.volume=101&rft.issue=4&rft.pages=303%E2%80%939&rft_id=info:doi\/10.3163%2F1536-5050.101.4.012&rft_id=info:pmc\/PMC3794686&rft_id=info:pmid\/24163602&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3794686&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\"><span style=\"display: none;\"> 
<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-YarfitzALib00-5\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-YarfitzALib00_5-0\" rel=\"external_link\">5.0<\/a><\/sup> <sup><a href=\"#cite_ref-YarfitzALib00_5-1\" rel=\"external_link\">5.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Yarfitz, S.; Ketchell, D.S. (2000). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC35196\" target=\"_blank\">\"A library-based bioinformatics services program\"<\/a>. <i>Bulletin of the Medical Library Association<\/i> <b>88<\/b> (1): 36\u201348. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC35196\/\" target=\"_blank\">PMC35196<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/10658962\" target=\"_blank\">10658962<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC35196\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC35196<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+library-based+bioinformatics+services+program&rft.jtitle=Bulletin+of+the+Medical+Library+Association&rft.aulast=Yarfitz%2C+S.%3B+Ketchell%2C+D.S.&rft.au=Yarfitz%2C+S.%3B+Ketchell%2C+D.S.&rft.date=2000&rft.volume=88&rft.issue=1&rft.pages=36%E2%80%9348&rft_id=info:pmc\/PMC35196&rft_id=info:pmid\/10658962&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC35196&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DavidoffTheInfo00-6\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-DavidoffTheInfo00_6-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Davidoff, F.; Florance, V. (2000). \"The informationist: A new health profession?\". <i>Annals of Internal Medicine<\/i> <b>132<\/b> (12): 996\u20138. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.7326%2F0003-4819-132-12-200006200-00012\" target=\"_blank\">10.7326\/0003-4819-132-12-200006200-00012<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/10858185\" target=\"_blank\">10858185<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+informationist%3A+A+new+health+profession%3F&rft.jtitle=Annals+of+Internal+Medicine&rft.aulast=Davidoff%2C+F.%3B+Florance%2C+V.&rft.au=Davidoff%2C+F.%3B+Florance%2C+V.&rft.date=2000&rft.volume=132&rft.issue=12&rft.pages=996%E2%80%938&rft_id=info:doi\/10.7326%2F0003-4819-132-12-200006200-00012&rft_id=info:pmid\/10858185&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HelmsBioinfo04-7\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HelmsBioinfo04_7-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Helms, A.J.; Bradford, K.D.; Warren, N.J.; Schwartz, D.G. (2004). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC521520\" target=\"_blank\">\"Bioinformatics opportunities for health sciences librarians and information professionals\"<\/a>. <i>Journal of the Medical Library Association<\/i> <b>92<\/b> (4): 489\u201393. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC521520\/\" target=\"_blank\">PMC521520<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/15494764\" target=\"_blank\">15494764<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC521520\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC521520<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Bioinformatics+opportunities+for+health+sciences+librarians+and+information+professionals&rft.jtitle=Journal+of+the+Medical+Library+Association&rft.aulast=Helms%2C+A.J.%3B+Bradford%2C+K.D.%3B+Warren%2C+N.J.%3B+Schwartz%2C+D.G.&rft.au=Helms%2C+A.J.%3B+Bradford%2C+K.D.%3B+Warren%2C+N.J.%3B+Schwartz%2C+D.G.&rft.date=2004&rft.volume=92&rft.issue=4&rft.pages=489%E2%80%9393&rft_id=info:pmc\/PMC521520&rft_id=info:pmid\/15494764&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC521520&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GeerBroad06-8\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GeerBroad06_8-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Helms, A.J.; Bradford, K.D.; Warren, N.J.; Schwartz, D.G. (2004). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC521520\" target=\"_blank\">\"Bioinformatics opportunities for health sciences librarians and information professionals\"<\/a>. 
<i>Journal of the Medical Library Association<\/i> <b>92<\/b> (4): 489\u201393. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC521520\/\" target=\"_blank\">PMC521520<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/15494764\" target=\"_blank\">15494764<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC521520\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC521520<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Bioinformatics+opportunities+for+health+sciences+librarians+and+information+professionals&rft.jtitle=Journal+of+the+Medical+Library+Association&rft.aulast=Helms%2C+A.J.%3B+Bradford%2C+K.D.%3B+Warren%2C+N.J.%3B+Schwartz%2C+D.G.&rft.au=Helms%2C+A.J.%3B+Bradford%2C+K.D.%3B+Warren%2C+N.J.%3B+Schwartz%2C+D.G.&rft.date=2004&rft.volume=92&rft.issue=4&rft.pages=489%E2%80%9393&rft_id=info:pmc\/PMC521520&rft_id=info:pmid\/15494764&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC521520&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LyonCarving06-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LyonCarving06_9-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span 
class=\"citation Journal\">Lyon, J.A.; Tennant, M.R.; Messner, K.R.; Osterbur, D.L. (2006). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC1525329\" target=\"_blank\">\"Carving a niche: Establishing bioinformatics collaborations\"<\/a>. <i>Journal of the Medical Library Association<\/i> <b>94<\/b> (3): 330\u20135. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC1525329\/\" target=\"_blank\">PMC1525329<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/16888668\" target=\"_blank\">16888668<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC1525329\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC1525329<\/a><\/span>.<\/span><span class=\"Z3988\" 
title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Carving+a+niche%3A+Establishing+bioinformatics+collaborations&rft.jtitle=Journal+of+the+Medical+Library+Association&rft.aulast=Lyon%2C+J.A.%3B+Tennant%2C+M.R.%3B+Messner%2C+K.R.%3B+Osterbur%2C+D.L.&rft.au=Lyon%2C+J.A.%3B+Tennant%2C+M.R.%3B+Messner%2C+K.R.%3B+Osterbur%2C+D.L.&rft.date=2006&rft.volume=94&rft.issue=3&rft.pages=330%E2%80%935&rft_id=info:pmc\/PMC1525329&rft_id=info:pmid\/16888668&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC1525329&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NIHAboutUs-10\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-NIHAboutUs_10-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/nihlibrary.nih.gov\/about-us\" target=\"_blank\">\"About Us\"<\/a>. <i>NIH Library<\/i>. National Institutes of Health<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/nihlibrary.nih.gov\/about-us\" target=\"_blank\">https:\/\/nihlibrary.nih.gov\/about-us<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 09 March 2018<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=About+Us&rft.atitle=NIH+Library&rft.pub=National+Institutes+of+Health&rft_id=https%3A%2F%2Fnihlibrary.nih.gov%2Fabout-us&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BhagwatMini06-11\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BhagwatMini06_11-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Bhagwat, M.; Wheeler, D.; Valjavec-Gratian, M. (2006). <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.ncbi.nlm.nih.gov\/Class\/minicourses\/\" target=\"_blank\">\"Mini Courses\"<\/a>. National Center for Biotechnology Information<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.ncbi.nlm.nih.gov\/Class\/minicourses\/\" target=\"_blank\">https:\/\/www.ncbi.nlm.nih.gov\/Class\/minicourses\/<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Mini+Courses&rft.atitle=&rft.aulast=Bhagwat%2C+M.%3B+Wheeler%2C+D.%3B+Valjavec-Gratian%2C+M.&rft.au=Bhagwat%2C+M.%3B+Wheeler%2C+D.%3B+Valjavec-Gratian%2C+M.&rft.date=2006&rft.pub=National+Center+for+Biotechnology+Information&rft_id=https%3A%2F%2Fwww.ncbi.nlm.nih.gov%2FClass%2Fminicourses%2F&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NIHPrecision18-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-NIHPrecision18_12-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span 
class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.ninr.nih.gov\/training\/trainingopportunitiesintramural\/bootcamp\" target=\"_blank\">\"NINR \"Precision Health: Smart Technologies, Smart Health\u201d Boot Camp\"<\/a>. <i>National Institute of Nursing Research<\/i>. National Institutes of Health<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.ninr.nih.gov\/training\/trainingopportunitiesintramural\/bootcamp\" target=\"_blank\">https:\/\/www.ninr.nih.gov\/training\/trainingopportunitiesintramural\/bootcamp<\/a><\/span><span class=\"reference-accessdate\">. Retrieved 09 March 2018<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=NINR+%22Precision+Health%3A+Smart+Technologies%2C+Smart+Health%E2%80%9D+Boot+Camp&rft.atitle=National+Institute+of+Nursing+Research&rft.pub=National+Institutes+of+Health&rft_id=https%3A%2F%2Fwww.ninr.nih.gov%2Ftraining%2Ftrainingopportunitiesintramural%2Fbootcamp&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NIHSGI18-13\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-NIHSGI18_13-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.ninr.nih.gov\/training\/trainingopportunitiesintramural\/summergeneticsinstitute\" target=\"_blank\">\"Summer Genetics Institute (SGI)\"<\/a>. <i>National Institute of Nursing Research<\/i>. National Institutes of Health<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.ninr.nih.gov\/training\/trainingopportunitiesintramural\/summergeneticsinstitute\" target=\"_blank\">https:\/\/www.ninr.nih.gov\/training\/trainingopportunitiesintramural\/summergeneticsinstitute<\/a><\/span><span class=\"reference-accessdate\">. Retrieved 09 March 2018<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Summer+Genetics+Institute+%28SGI%29&rft.atitle=National+Institute+of+Nursing+Research&rft.pub=National+Institutes+of+Health&rft_id=https%3A%2F%2Fwww.ninr.nih.gov%2Ftraining%2Ftrainingopportunitiesintramural%2Fsummergeneticsinstitute&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NIHNationalHuman18-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-NIHNationalHuman18_14-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.genome.gov\/10000217\/nhgri-short-course-in-genomics\/\" target=\"_blank\">\"National Human Genome Research Institute Short Course in Genomics\"<\/a>. <i>National Human Genome Research Institute<\/i>. National Institutes of Health<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.genome.gov\/10000217\/nhgri-short-course-in-genomics\/\" target=\"_blank\">https:\/\/www.genome.gov\/10000217\/nhgri-short-course-in-genomics\/<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 09 March 2018<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=National+Human+Genome+Research+Institute+Short+Course+in+Genomics&rft.atitle=National+Human+Genome+Research+Institute&rft.pub=National+Institutes+of+Health&rft_id=https%3A%2F%2Fwww.genome.gov%2F10000217%2Fnhgri-short-course-in-genomics%2F&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CDUBioinfo16-15\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CDUBioinfo16_15-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.cdrewu.edu\/CDUNewsletters\/activenews_view.asp?articleID=719\" target=\"_blank\">\"Bioinformatics: Clinical Genomics Subject of Mini Course\"<\/a>. <i>CDU Newsletter<\/i>. Charles R. Drew University of Medicine and Science. 01 April 2016<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.cdrewu.edu\/CDUNewsletters\/activenews_view.asp?articleID=719\" target=\"_blank\">https:\/\/www.cdrewu.edu\/CDUNewsletters\/activenews_view.asp?articleID=719<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Bioinformatics%3A+Clinical+Genomics+Subject+of+Mini+Course&rft.atitle=CDU+Newsletter&rft.date=01+April+2016&rft.pub=Charles+R.+Drew+University+of+Medicine+and+Science&rft_id=https%3A%2F%2Fwww.cdrewu.edu%2FCDUNewsletters%2Factivenews_view.asp%3FarticleID%3D719&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-UoMConnective13-16\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-UoMConnective13_16-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">University of Maryland Health Sciences and Human Services Library (2013). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www2.hshsl.umaryland.edu\/newsletter\/?p=1434\" target=\"_blank\">\"PubChem Training from NLM\"<\/a>. <i>Connective Issues<\/i> <b>7<\/b> (4)<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www2.hshsl.umaryland.edu\/newsletter\/?p=1434\" target=\"_blank\">http:\/\/www2.hshsl.umaryland.edu\/newsletter\/?p=1434<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=PubChem+Training+from+NLM&rft.jtitle=Connective+Issues&rft.aulast=University+of+Maryland+Health+Sciences+and+Human+Services+Library&rft.au=University+of+Maryland+Health+Sciences+and+Human+Services+Library&rft.date=2013&rft.volume=7&rft.issue=4&rft_id=http%3A%2F%2Fwww2.hshsl.umaryland.edu%2Fnewsletter%2F%3Fp%3D1434&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-FAES1516-17\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-FAES1516_17-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/faes.org\/sites\/default\/files\/files\/FAES%20Catalog%202015-16%20FINAL.pdf\" target=\"_blank\">\"2015\u20132016 Catalog of Courses and Student Handbook\"<\/a> (PDF). Foundation for Advanced Education in the Sciences. 2015. p. 24<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/faes.org\/sites\/default\/files\/files\/FAES%20Catalog%202015-16%20FINAL.pdf\" target=\"_blank\">https:\/\/faes.org\/sites\/default\/files\/files\/FAES%20Catalog%202015-16%20FINAL.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=2015%E2%80%932016+Catalog+of+Courses+and+Student+Handbook&rft.atitle=&rft.date=2015&rft.pages=p.+24&rft.pub=Foundation+for+Advanced+Education+in+the+Sciences&rft_id=https%3A%2F%2Ffaes.org%2Fsites%2Fdefault%2Ffiles%2Ffiles%2FFAES%2520Catalog%25202015-16%2520FINAL.pdf&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MLA12Prelim12-18\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MLA12Prelim12_18-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.mlanet.org\/d\/do\/1854\" target=\"_blank\">\"MLA '12 Preliminary Program\"<\/a> (PDF). Medical Library Association. 2012. p. 15<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.mlanet.org\/d\/do\/1854\" target=\"_blank\">http:\/\/www.mlanet.org\/d\/do\/1854<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=MLA+%2712+Preliminary+Program&rft.atitle=&rft.date=2012&rft.pages=p.+15&rft.pub=Medical+Library+Association&rft_id=http%3A%2F%2Fwww.mlanet.org%2Fd%2Fdo%2F1854&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SLABiofeed10-19\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-SLABiofeed10_19-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Hooper-Lane, C. (2010). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dbiosla.org\/publications\/pubs\/biofeedback\/Spring2010.pdf\" target=\"_blank\">\"2010 Conference Program Preview\"<\/a> (PDF). <i>Biofeedback<\/i> <b>35<\/b> (2): 2<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/dbiosla.org\/publications\/pubs\/biofeedback\/Spring2010.pdf\" target=\"_blank\">http:\/\/dbiosla.org\/publications\/pubs\/biofeedback\/Spring2010.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=2010+Conference+Program+Preview&rft.jtitle=Biofeedback&rft.aulast=Hooper-Lane%2C+C.&rft.au=Hooper-Lane%2C+C.&rft.date=2010&rft.volume=35&rft.issue=2&rft.pages=2&rft_id=http%3A%2F%2Fdbiosla.org%2Fpublications%2Fpubs%2Fbiofeedback%2FSpring2010.pdf&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-NIHLibBioinfo-20\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-NIHLibBioinfo_20-0\" rel=\"external_link\">20.0<\/a><\/sup> <sup><a href=\"#cite_ref-NIHLibBioinfo_20-1\" rel=\"external_link\">20.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/nihlibrary.nih.gov\/services\/bioinformatics-support\" target=\"_blank\">\"Bioinformatics Support Program\"<\/a>. <i>NIH Library<\/i>. National Institutes of Health<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/nihlibrary.nih.gov\/services\/bioinformatics-support\" target=\"_blank\">https:\/\/nihlibrary.nih.gov\/services\/bioinformatics-support<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 09 March 2018<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Bioinformatics+Support+Program&rft.atitle=NIH+Library&rft.pub=National+Institutes+of+Health&rft_id=https%3A%2F%2Fnihlibrary.nih.gov%2Fservices%2Fbioinformatics-support&rfr_id=info:sid\/en.wikipedia.org:Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. The original article lists references in alphabetical order, by author; this version lists them in order of appearance, by design.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library\">https:\/\/www.limswiki.org\/index.php\/Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","a3349d5e1cf1d4519948fbbbfffe0deb_images":[],"a3349d5e1cf1d4519948fbbbfffe0deb_timestamp":1544813850,"ec047b57c5e01fb4daaaffc7b376efce_type":"article","ec047b57c5e01fb4daaaffc7b376efce_title":"Big data management for cloud-enabled geological information services (Zhu et al. 2018)","ec047b57c5e01fb4daaaffc7b376efce_url":"https:\/\/www.limswiki.org\/index.php\/Journal:Big_data_management_for_cloud-enabled_geological_information_services","ec047b57c5e01fb4daaaffc7b376efce_plaintext":"\n\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\n\t\t\t\tJournal:Big data management for cloud-enabled geological information services\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tFrom LIMSWiki\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\tJump to: navigation, search\n\n\t\t\t\t\t\n\t\t\t\t\tFull article title\n \nBig data management for cloud-enabled geological information servicesJournal\n \nScientific ProgrammingAuthor(s)\n \nZhu, Yueqin; Tan, Yongjie; Luo, Xiong; He, ZhijieAuthor affiliation(s)\n \nChina Geological Survey, Ministry of Land and Resources, University of Science and\r\nTechnology Beijing, Beijing Key Laboratory of Knowledge Engineering for Materials ScienceEditors\n \nLiu, A.Year published\n \n2018Volume and issue\n \n2018(2018)Page(s)\n \n1327214DOI\n \n10.1155\/2018\/1327214ISSN\n \n1875-919XDistribution license\n 
\nCreative Commons Attribution 4.0 InternationalWebsite\n \nhttps:\/\/www.hindawi.com\/journals\/sp\/2018\/1327214\/Download\n \nhttp:\/\/downloads.hindawi.com\/journals\/sp\/2018\/1327214.pdf (PDF)\n\nContents\n\n1 Abstract \n2 Introduction \n3 Review on cloud-enabled geological information services \n\n3.1 System architecture \n3.2 Requirement from big data management \n\n\n4 Challenges for big data management in cloud-enabled geological information services \n\n4.1 Volume \n4.2 Variety \n4.3 Velocity \n4.4 Veracity \n\n\n5 Key technologies and trends on big data management in cloud-enabled geological information services (CEGIS) \n\n5.1 Geological big data collection and preprocessing \n\n5.1.1 Geological data collection access \n5.1.2 Quality and usability characteristics of geological data \n5.1.3 Geological data entity recognition model \n5.1.4 Aggregation of geological big data \n5.1.5 Management of geological big data evolution tracking records \n\n\n5.2 Geological big data storage and management \n5.3 Geological big data analysis and mining \n\n5.3.1 Geological big data analysis \n5.3.2 Geological big data mining \n\n\n5.4 Highly performable big data cloud computing platform \n5.5 Applications of geological big data technologies \n\n5.5.1 Exploration of metallogenic law \n5.5.2 Smart prospecting \n5.5.3 Service of people\u2019s livelihood geology \n5.5.4 Application of knowledge visualization service \n\n\n\n\n6 Conclusion \n7 Conflicts of interest \n8 Acknowledgments \n9 References \n10 Notes \n\n\n\nAbstract \nCloud computing as a powerful technology of performing massive-scale and complex computing plays an important role in implementing geological information services. In the era of big data, data are being collected at an unprecedented scale. Therefore, to ensure successful data processing and analysis in cloud-enabled geological information services (CEGIS), we must address the challenging and time-demanding task of big data processing. 
This review starts by elaborating the system architecture and the requirements for big data management. This is followed by the analysis of the application requirements and technical challenges of big data management for CEGIS in China. This review also presents the application development opportunities and technical trends of big data management in CEGIS, including collection and preprocessing, storage and management, analysis and mining, parallel computing-based cloud platforms, and technology applications.\n\nIntroduction \nIn the era of big data, the data-driven modeling method enables us to exploit the potential of massive amounts of geological data easily.[1][2][3] In particular, by mining the data scientifically, one can offer new services that bring higher value to customers. Furthermore, it is now possible to implement the transition from digital geology to intelligent geology by integrating multiple systems in geological research through the use of big data and other technologies.[4]\nThe application of geological data management in the cloud makes it possible to fully utilize structured and unstructured data, including geology, minerals, geophysics, geochemistry, remote sensing, terrain, topography, vegetation, architecture, hydrology, disasters, and other digital geological data distributed in every place on the surface of the earth.[4][5] Moreover, the geological cloud will enable the integration of data collection, resource integration, data transmission, information extraction, and knowledge mining, which will pave the way for the transition from data to information, from information to knowledge, and from knowledge to wisdom. 
In addition, it supports data analysis, mining, organization, and management services for the scientific management of land resources, prospecting breakthrough strategic action and social services, while conducting multilevel, multiangle, and multiobjective demonstration applications on geological data for government decision-making, scientific research, and public services.[5]\nBig data technologies are bringing unprecedented opportunities and challenges to various application areas, especially to geological information processing.[2][6][7] Under these circumstances, some advances have been achieved in this area.[8][9] Furthermore, various disciplines of science and engineering have shown growing interest in research on the geological data generated in geological information services (GIS). We analyzed the number of documents indexed in the \u201cWeb of Science\u201d research database.[10] Figures 1 and 2 show that, over the past ten years, the number of documents with \u201cgeological data\u201d in the title and in the topic, respectively, has been increasing. Hence, geological data analysis in GIS is currently an interesting and important research topic.\n\nFigure 1. The trend of the number of documents in which \u201cgeological data\u201d is in the title, from 2007 to 2016.\n\nFigure 2. The trend of the number of documents in which \u201cgeological data\u201d is in the topic, from 2007 to 2016.\n\nConsidering the development status of cloud-enabled geological information services (CEGIS) and the application requirements of big data management analysis, this article describes the significant impact on, and transformation of, GIS brought about by the advancement of big data technologies. 
Furthermore, this article outlines the future application development and technology development trend of big data management analysis in CEGIS.\nThe remainder of this article is organized as follows. In the next section we provide a review of CEGIS, with an emphasis on the system architecture and the requirements for big data management. Then, the challenges for big data management in CEGIS are presented. The key technologies and trends on big data management in CEGIS are analyzed afterwards, and finally we draw conclusions from the research.\n\nReview on cloud-enabled geological information services \nThe construction of a geological cloud differs from the current big data analysis based on the internet of things (IoT). Having a deep understanding of data characteristics is necessary to collect, process, analyze, and interpret data in different fields, because the nature and types of data vary in different fields and in different problems. Geology is a data-intensive science, and geological data are characterized by multisource heterogeneity, spatiotemporal variation, correlation, uncertainty, fuzziness, and nonlinearity. Therefore, the geological cloud has a certain degree of confidentiality and it is highly domain-specific; meanwhile, it is developed on the basis of a large amount of geological data accumulated over a long period of time.[5][11] Many real-time data are also generated from geological disasters and the geological environment. The geological cloud includes core basic data, which can be divided into three parts: an existing structured database, some unstructured data, and public application data. 
Therefore, it is important to take good advantage of the existing traditional structured data, use the big data technologies to deal with the relevant unstructured data, and also consider the peripheral public data.\nGeological big data are multidimensional, and they consist of both structured and unstructured data.[12] The technical methods of big data analysis differ greatly from those of professional databases. Long-term geological survey and study have yielded years of geological information, forming a rich and professional database, which is an important fundamental assurance for land and resources science management, geological survey, and geological information public service.[13] This \u201cprofessional cloud\u201d objectively requires technology research and development, such as the construction of a professional local area network, a data sharing platform, and geological big data visualization services. Hence, the construction of a geological cloud service is closely related to land resource management, deployment decisions, and the application demand of public service. The key technologies of research and development include the following: unstructured data extraction and mining analysis, structured and unstructured data mixed storage and management, big data sharing platform, data transmission, and visualization.[11]\nGenerally, the construction of a geological cloud is a long-term systematic project. This means that it is required to follow the basic principles of \u201cstanding on the reality, focusing on the future\u201d and \u201cfocusing on the long-term and overall situation, embarking on the current and local situation,\u201d in order to achieve the analysis and application of geological cloud public data and core data gradually in accordance with the technical route of big data analysis; thus the construction of a geological cloud will be implemented eventually. 
For the earth, land and resource management should cover many aspects, including human behavior, climate change, development and utilization of various resources, natural disasters, environmental pollution, and the ecosystem cycle. Introducing big data technologies can integrate this type of resource information and provide a uniform way of dealing with problems related to the information resources of the entire earth, which has a significant effect on the strategic planning of land and resource management.[3]\nThe geological cloud is an important component of the scientific process for geological data research. The ultimate goal for developing a geological cloud is to better describe and understand the complex earth system and geological framework, provide the scientific basis for the description of the land surface and the biodiversity characteristics of the earth, and improve the ability to deal with complex social problems.\n\nSystem architecture \nBecause the business service functions of each country differ, the system architecture of the geological cloud will also vary. In Figure 3, we present a system architecture, using China as an example.[14]\n\nFigure 3. The system architecture of a geological cloud\n\nThe geological cloud combines the geological survey intranet and the geological survey extranet. 
It enables the sharing and management of computing resources, storage resources, network resources, software resources, and geological data resources.[15]\nThe geological cloud can be summarized as having the following characteristics[14]:\n(i) \u201cOne platform: The geological cloud management platform\u201d: It uniformly manages computing resources, storage resources, network resources, software resources, and geological data resources.\n(ii) \u201cTwo networks: The geological survey intranet and the geological survey extranet\u201d: Here, the intranet is constructed by creating a network that is physically isolated from the internet. The intranet is developed on the basis of the existing geological survey network, and each node is linked through a dedicated line or bare fiber. All of the internal business management systems, software systems, and data are deployed on the intranet, providing services to 28 local units and the users of more than 350 geological survey projects. Facilitated by the public geological survey network, the geological survey business management system, geological data information service system, and public geological data can be deployed on the extranet accessed by the general public. The communication between the intranet and the extranet, including data exchange and audit, can be carried out via a single-directional light gate.\n(iii) \u201cOne main node and three domain-specific nodes\u201d: One main node is constructed at the China Geological Survey Development Research Center. In addition, three domain-specific nodes \u2014 namely the marine node, geological environment node, and aviation geophysical exploration and remote sensing node \u2014 are also constructed. Each node is configured with the corresponding servers, storage equipment, network equipment, management platform, large-scale specialized data processing software system, and various customized applications. 
Each node would store huge amounts of geological data and conform to current data security standards. The main node and the domain-specific nodes are linked via optical fibers. The main node will consist of 200 computing nodes with three\u2009petabytes of storage capacity and will be equipped with geological data processing software systems. The main node will be hosted in a medium-sized supercomputing center, and it will provide support for three-dimensional seismic exploration data processing and other large-scale computing. The three domain-specific nodes are to maintain their scale in the near future to facilitate reasonable scheduling and efficient utilization of information resources and data resources.\nA system for geological survey business management and auxiliary decision-making is deployed on the extranet. The system provides a real-time tracking and management function for geological survey projects and various resources.\nMain users of the geological cloud include institutional users, geological survey project users, and users from the general public. The institutional users can store the current geological database and newly collected data in the geological cloud through the geological survey business network and can obtain the geological data of other institutions from the cloud as needed. The geological survey project users can access the cloud geological background data through 4G or satellite lines and can collect data through the data collection system.\n\nRequirement from big data management \nThe construction of a geological cloud must meet customer demand. Big data technologies are then used as the means to implement the geological cloud.\nThe types and quantity of geological data have been continuously growing over the years. 
Geological data include all kinds of electronic documents, structured, semistructured, and unstructured data, such as various databases (map database, spatial database, and attribute database), pictures, tables, video, and audio. Generally speaking, those important data may be buried in the massive dataset without the guidance for requirements. Hence, the first step is to understand the user requirements and then gain the capability of large-scale data processing. This is followed by data mining, algorithms, and analysis, which will ultimately generate value. Big data technologies in the field of geography must meet different needs from people at different levels, including the public demand of the geologic data services and professional data demand for geological research institutions, as well as related enterprises and government departments.[16]\nOn the basis of big data analysis technologies, a complete data link is formed connecting data, information, knowledge, and service, through the use of an advanced cloud computing system, IoT, and big data processing flow. It is shown in Figure 4.[5]\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 4. Schematic diagram of big data analysis\n\n\n\nChallenges for big data management in cloud-enabled geological information services \nGeological big data are generated regarding various layers of the earth, the history of the conformation and evolution of the earth, and the material composition of the earth and its changes. It also involves the exploration and utilization of mineral resources. In the current geological work, the collection, mining, processing, analysis, and utilization of various complex type data are closely related to those general big data. The \u201c4V\u201d characteristics of big data \u2014 namely volume, velocity, variety, and veracity \u2014 also apply to geological big data.\n\nVolume \nCurrently, there is no consensus on the collective size of geological data. 
Geological big data include geology, minerals, remote sensing, geophysical exploration, geochemical exploration, surveying, and mapping data, which are interconnected and integrated. In terms of the number of mines, there are at least 70,000 in China, and some official documents and popular science books indicate that there are more than 200,000 deposits and minerals that have been found. This collection of information is huge and cannot be processed using conventional tools. For example, an Excel spreadsheet cannot contain all the information of 70,000 mining areas. As such, it is difficult to classify and rank the 200,000 mines, so it is necessary to rely on the concepts and technologies of big data.[17]\nEspecially in recent years, images, video, and other types of data have emerged on a large scale. With the application of 3D scanning and other devices, the data volume has been increasing exponentially. The ability to describe the data is more and more powerful, and the data are gradually approximated to the real world. In addition, the large amount of data is also reflected in the aspect that the methods and ideas used by people to deal with data have undergone a fundamental change. In the early days, people used the sampling method to process and analyze data in order to approximate the objective with a small number of subsample data. With the development of technologies, the number of samples gradually approaches the overall data. 
Using all the data can lead to a higher accuracy, which can explain things in more detail.[18]\nRecently, the China geological survey system has built many databases, including[17]:\n\n a regional geological database (covering the 1\u2009:\u20092500000, 1\u2009:\u20091000000, 1\u2009:\u2009500000, 1\u2009:\u2009250000, and 1\u2009:\u2009200000 regional geological map; the national 1\u2009:\u2009200000 natural sand; the isotope geological dating; and the lithostratigraphic unit database);\n a basic geological database (covering the national rock property database and national geological working degree database);\n a mineral resources database (covering the national mineral resources, the national mineral resources utilization survey mining resources reserves verification results, the national survey of large and medium-sized mines, the prospect of mineral resources, the survey of the resources potential of major solid mineral resources in China, and the geological and mineral resources database);\n an oil and gas energy database (covering the oil and gas basins in China, the geological survey results of the national oil and gas resources, the national petroleum and geophysical exploration, national shale gas, national coal bed methane, national natural gas hydrate, and other databases);\n a geophysical database (covering 1\u2009:\u20091 million, 1\u2009:\u2009500000, 1\u2009:\u2009250000, 1\u2009:\u2009200000, and 1\u2009:\u200950000 gravity, national regional gravity, national aeromagnetism, national ground magnetism, national electrical survey, seismic survey, national aviation radioactivity, and national logging database);\n a geochemical database (covering the databases of national 1\u2009:\u2009250000 and 1\u2009:\u200920 geochemical exploration, national multiobjective geochemical and national land quality evaluation results);\n a remote sensing survey database (covering national aeronautical remote sensing image, China resources satellite data, space remote 
sensing image, national mine environmental remote sensing monitoring, national high score satellite, and other databases);\n a drilling database (covering the national geological borehole information, the national important geological borehole, the Chinese mainland scientific drilling core scanning image library, and so on);\n a hydraulic cycle hazards database;\n a data literature database;\n a special subject database (covering the national mineral resources potential evaluation database, the important mineral \u201cthree-rate\u201d investigation and evaluation database); and\n a work management database (covering the national exploration right, mining right, mining right verification, geological information metadata database, and many others).\nThose databases are still being expanded and refined, and their practical value has not yet been fully realized. However, it is virtually impossible for the vast majority of researchers to access all of the above data; at most, they use their own accumulated data. Even so, their accumulated data, in both quantity and type, far exceed what was available 10 or 20 years ago, so they are, in fact, in the era of \u201crelatively big data.\u201d From 1999 to 2004, for example, although 202 national academic experts participated in the \u201cChinese mineralization system and regional metalorganic evaluation\u201d project, it held master data for only 4500 properties (all kinds of minerals). From 2006 to 2013, the study of \u201cnational important mineral and regional mineralization laws\u201d was conducted, and the number of mining areas covered by the mineral resources research institute alone was 30,600. Therefore, the growth in information and in the amount of data over the last 10 years is unprecedented.\n\nVariety \nFrom a formal point-of-view, geological big data have many characteristics, including multidimensionality, multiscale, and multitenses. 
And they contain structured, semistructured, and unstructured data and usually are stored in forms of text, graphics, images, databases (including image databases, spatial databases, and attribute databases), tables, videos, and audio, often in a fragmented state. For example, a large number of field outcrop description, borehole core description, geological survey, exploration report, geological mapping, drawing, and photo data were stored and managed in the form of paper for a long time. Even the numerous relational databases and spatial databases were primarily used to store and manage structured data that are tabulated and vectorized, while the text descriptions, records, and summaries were directly stored. Very few standardized processing and structural transformations were performed. Furthermore, there is no tool available to effectively integrate storage and manage structured, semistructured, and unstructured data.\n\nVelocity \nThe increase of geological data is very fast, especially in remote sensing geology, aviation geophysical exploration, regional geochemical exploration, and other fields, due to the introduction of new technologies and new methods. Meanwhile, high-speed processing is also a characteristic of big data. In addition to the need of analyzing data in real time, people also need to describe the results of data mining and processing through the use of several data processing techniques, such as image and video, while requiring effective and efficient handling skills. For example, the detection of deep earth information not only needs to obtain parameters of the seismic wave reflection and refraction but also needs to conduct quick processing, so as to timely predict whether earthquakes will occur and forecast the time, location, and intensity. In this way, we can avoid the disaster effectively. 
When applying a variety of data to a particular mountain, one should learn which data have spatial limitations and which are not related to spatiality, so that one can deduce the metallogenic law and better guide prospecting.[18]\n\nVeracity \nWhen it comes to the value of big data, most people regard it as having low value density, meaning that the truly useful information in the vast amount of data is very small. Taking video as an example, the useful data may be only a second or two in a continuous monitoring process. At the same time, big data can be high-value without requiring heavy investment; just collecting information from the internet can bring business value. Therefore, big data has the characteristics of low value density and high business value. The same is true for geological big data. So far, a great deal of geophysical prospecting information has been gathered, but only a small part of it has been confirmed, and few mines have been discovered. But once a breakthrough is made, its socioeconomic value is enormous, as with the lithium polymetallic deposit in Tibet and the newly discovered Jima copper polymetallic deposit in the outskirts of Sichuan.[18]\nIn addition, the spatial attribute and temporal attribute of geological data also bring a big challenge to data accuracy. Any geological data have spatial attributes, and their values are reflected in the spatial law of distribution of mineral resources. For this reason, in the process of establishing the metallogenic series, exploring the metallogenic law, and constructing the mathematical model, the spatial attribute of the metallogenic model should be considered. Obviously, every metallogenic series has its own spatial attributes. Geological data also have a time attribute, which sets geology apart from physics, chemistry, and other natural sciences. One of the fundamental pillars of geology is the geological time scale. 
The rocks, strata, and deposits of different geological periods have different distribution characteristics and regularity, so those data have their own time attribute.\nIt is obvious that those characteristics of geological big data mentioned above impose very challenging obstacles to the data management in CEGIS. The challenges related to geological big data management can be summarized as follows:\n(i) It is quite difficult to describe and model geological big data since there are few effective description mechanisms for characteristics and object modeling approaches under the cloud computing environment.\n(ii) There remain many technical issues that must be addressed to fully manage, mine, analyze, integrate, and share those geological big data, in consideration of those complex characteristics, including multi-source heterogeneous data, highly spatiotemporal variation, high-volume and high-correlation data, and many others.\n(iii) Many issues appear in achieving decision support, such as data incompleteness, data uncertainty, and high-dimensionality of data.\nThe broad range of challenges described here make good topics for research within the field of big data management in CEGIS. They are analyzed in the next section.\n\nKey technologies and trends on big data management in cloud-enabled geological information services (CEGIS) \nWith the rapid advancement of big data technologies, some key technologies are accordingly developed for big data management in CEGIS. Specifically, a schematic diagram of those key technologies is shown in Figure 5. In this section we present an analysis on those key technologies, along with discussion of key related trends.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 5. 
Schematic diagram of key technologies for big data management in CEGIS\n\n\n\nGeological big data collection and preprocessing \nGeological big data collection and preprocessing aim to categorize the geological big data obtained from geological data, geological information, and geological literature.\n\nGeological data collection access \nIn addition to traditional collection methods, large-scale network information access is also required, providing real-time, high-concurrency, and fast web content acquisition that suits the application characteristics of the cloud environment. Because geological information generated from the network is currently growing very fast, the big data analysis system should obtain relevant data quickly.\n\nQuality and usability characteristics of geological data \nValuable information needs to be distinguished and identified through intelligent discovery and management technologies. Because the information value density differs from one data source to another, filtering out useless or low-value data sources can effectively reduce data storage and processing costs. It can also further improve the efficiency and accuracy of analysis.\n\nGeological data entity recognition model \nAccording to the subject domain of geology, the distributed data are processed, integrated, and then extracted to form a data warehouse. When extracting data in the field of geology, an entity modeling method is needed to abstract entities from the vast amount of data, so as to find the relationships between those entities. 
This approach ensures that the data used in the data warehouse are consistent and relevant in accordance with the data model.[19] These recognized data are directly input into the system and stored as metadata, which could be used for data management and analysis.\n\nAggregation of geological big data \nGenerally, different data sources and even the same data source may generate data with different formats. As mentioned above, because these structured, semistructured, and unstructured multimodal geological big data are integrated together, the data heterogeneity is obvious in big data analysis. Data aggregation, as the key technology in achieving data extraction and transformation[20], enables data sharing and data fusion between heterogeneous data sources. Through the use of heterogeneous information aggregation technologies, unified data retrieval and data presentation could be achieved. After the distributed heterogeneous data sources are aggregated, their data are extracted and converted to automatically construct a subject domain database and data warehouse.[21]\n\nManagement of geological big data evolution tracking records \nIn order to effectively utilize geological big data, it is necessary to track the evolution of big data during the whole life cycle of GIS, with the purpose of achieving traceable big data management.\nHere, we provide an example of aggregating and collecting geological big data in CEGIS; Figure 6 illustrates this process. While developing CEGIS, all kinds of geological data should be processed. Through the use of the geological cloud, big data are collected, and then they are aggregated to achieve some key functions in the geological information service platform, including catalog sharing, intelligent searching, data products release, and collaborative service.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 6. 
An example of aggregating and collecting geological big data in CEGIS\n\n\n\nGeological big data storage and management \nFrom the data collection perspective, geological data can be divided into field survey data, drilling and engineering exploration data, remote detection data, analytical test data, and comprehensive study data. From the angle of comprehensive application fields, they can be also divided into regional geological survey data, energy and mineral resources evaluation and exploration survey results data, geological disaster monitoring and early warning data, geological environment survey and evaluation results data, and marine geological survey and evaluation data. From the data formality point of view, they can be divided into picture data, text report data, tabular data, and image data. These data are collected by different units.\nFacing these complex geological big data mentioned above, the traditional relational database will find it difficult to handle them, while the distributed storage system can be used more effectively to store such huge amounts of data and manage them. The data system places the massive data in many machines, which avoids such limitations of storage capacity, though it also brings many problems that have not occurred before in stand-alone systems. Hence, some distributed data storage solutions have accordingly emerged, including Hadoop, Spark, and other nonrelational database systems (like HBase, MongoDB, and many others).[22] These different solutions satisfy the specific requirements from different applications. When applying to the analysis of big data, different solutions can be employed according to the specific needs of different intelligence analysis. Furthermore, different solutions can be combined to meet specific needs. 
Actually, there have been some attempts to develop combination strategies for distributed storage models, varying in the big data management performance requirement and in the complexity of the collected big data that are supported by the distributed storage system.[23] Hence, there is still room for improvement and optimization of geological big data storage, for instance by designing a hybrid distributed storage model that uses the cloud's advantages of flexibly scalable deployment to meet the users\u2019 requirement for geological big data resource management with satisfactory data durability and high availability.[23]\nHere, the hot research topics include the following:\n(i) For geological applications, load optimization storage should be implemented to achieve the coupling of data storage and application, as well as the coupling of distributed file system and the new storage system.\n(ii) Based on the application characteristics of distributed databases, more studies could be conducted on the application of new databases such as NoSQL and NewSQL in geological survey work.\nWith the development of big data technologies, more and more mature distributed data storage solutions will emerge and will be applied to big data analysis.[24][25]\nSpecifically, in the management of geological big data, the implementation of data queries, such as spatial queries, has been a long-term focus. 
Generally, considering those advantages with unified modeling language (UML) and computer-aided software engineering (CASE) methodology, the spatial database could be accordingly designed and implemented to characterize and realize the object-oriented spatial vector big data firstly.[26] And then, in the developed spatial database, the function of self-generating codes would be achieved to realize two-way spatial query between graphic-objects and property data.[26] Moreover, in consideration of the complex characteristics of geological big data, the spatial query is achieved finally through the use of Flex technology in the ArcGIS Server software platform.[27] Practically speaking, with this technology, the spatial query could be implemented through two functions, including \u201cQuery\u201d and \u201cFind\u201d query methods.[28]\n\nGeological big data analysis and mining \nIn terms of geological data analysis and mining, it needs to combine geological data, geological information, and geological literature, through the analysis of geological application demand of real-time mining, to explore geological big data environment analysis and mining algorithms, in an effort to fully achieve the goal of intelligent mining for geological big data.\nFigure 7 shows a schematic diagram of discovering geological knowledge through analyzing and mining geological big data. It can be readily seen that geological big data analysis and mining play an important role in achieving the final goal. \n\r\n\n\n\n\n\n\n\n\n\n\n Figure 7. 
Schematic diagram of discovering geological knowledge through analyzing and mining geological big data\n\n\n\nRelevant research work mainly involves the following aspects.\n\nGeological big data analysis \nConsidering the special applications, geological big data technologies would apply big data concepts to analyze the metallogenic rules by making full use of various data related to ore, to recognize deposit metallogenic series, to summarize the metallogenic regularities and express them in an appropriate way (like voice, image, and many others), and to establish a scientific mathematical model. The model then uses new exploration data to predict future data and to guide geological prospecting.\nIn addition, it is necessary to pay special attention to the analysis of new geological big data information collected from social media and networks.[29] These include the geological text information flow data from microblog web sites, the geological multimedia data from media sharing web sites, the geology-related user interaction data on social networking web sites, and many others.[30] These multisource data complement traditional big data. Specifically, such data should be addressed with the help of multilingual information processing, multilingual machine translation, and social network cross-language retrieval.[31] Big data analysis of such data is a key to deep use of geological data in a broader dimension. 
With the maturity of big data analysis technologies, it becomes possible to analyze and extract valuable information from these data[32] and to provide effective solutions for geological big data applications.\n\nGeological big data mining \nData mining involves the extraction of unknown and useful knowledge and information from the massive multilevel spatiotemporal data and attribute data, using statistics, pattern recognition, artificial intelligence, set theory, fuzzy mathematics, cloud computing, machine learning, visualization, and relevant techniques and methods. Data mining could reveal the relationship and evolution trend behind the geological big data, achieve the automatic or semiautomatic acquisition of the new knowledge, and provide the decision basis for resource prediction, prospecting, environmental assessment, and disaster prevention and mitigation.[33] The knowledge is obtained directly from known geological data to provide relevant decision support.[34] In consideration of the amount of data, it may deal with terabytes or even petabytes of data, as well as multidimensional, noisy, and dynamic data. 
Because data mining algorithms will directly influence the outcome of the discovered knowledge, selecting the most appropriate algorithms and parallel computing strategy is the key to data mining.\nEffective data mining also could reduce manual intervention during information processing and make use of methods and tools of big data intelligent analysis.[35][36] Recently, there has been a growing interest in geological big data mining through the use of some novel computational intelligent methods such as rough set[37] and fuzzy aggregation.[38] Moreover, with the development of those neural-network-based machine learning algorithms in recent years, popular methods such as extreme learning machine[39][40], approximate dynamic programming[41], and kernel learning[42] could be used to further improve mining effectiveness for geological big data in the future.\n\nHighly performable big data cloud computing platform \nA highly performable big data cloud computing platform is the foundation for big data analysis. It enables parallel computing for large-scale incremental real-time data and large-scale heterogeneous data.[43][44][45][46]\nWith the advent of massive data storage solutions, many big data distributed computing frameworks have been proposed. Among them, Hadoop, MapReduce, Spark, and Storm are the most important distributed computing frameworks. These frameworks have different characteristics and solve different problems in applications.[47][48][49][50] Hadoop\/MapReduce is often used for offline complex big data processing, Spark is often employed in offline fast big data processing, and Storm is often available for real-time online big data processing. Different computing frameworks have their different advantages and disadvantages. Hadoop\/MapReduce is easy to program, and it has satisfactory scalability and fault tolerance. 
In addition, it is suitable for offline processing of massive data at the petabyte level, but it does not support real-time computation and flow calculation. Spark is a memory-based iterative computing framework. By placing intermediate data in memory, Spark can achieve higher iterative calculation performance. The programming model of Spark is more flexible than that of Hadoop\/MapReduce, but Spark is not suitable for those applications in which the fine-grained updates are conducted asynchronously. Hence, Spark may be unavailable for those application models that require incremental changes. Storm is suitable for stream data processing. It can be used to handle a stream of incoming messages and can write the processed results to a specified storage device. Another major application of Storm is real-time data processing where data are not necessary to be written into storage devices, which usually results in little time delay. Hence, Storm is particularly suitable for scenarios where real-time online analysis is required to obtain results for big data analysis.\nAn application example is the geological big data aggregation mining framework based on Hadoop.[16] Geological big data aggregation mining platform research is based on the China geological survey data network, and it uses the Hadoop technology to improve and modify the existing platform, to make it suitable for big data applications, and to provide a platform for the pilot applications. The geological survey grid platform can be updated in three layers: the virtual layer, the computing layer, and the terminal application layer. The virtual layer represents the virtualization of computer resources based on Hadoop distributed file system (HDFS) virtualization technology, which is the foundation of cloud computing and cloud services. The computing layer mainly uses the MapReduce method to implement the analysis algorithms for geological big data. 
Currently, the geological big data technologies mainly use the block calculation strategy to achieve parallel analysis through the utilization of the characteristics of Hadoop, in an effort to speed up the analysis and processing of geological data. The terminal application layer is designed to display the results and receive user feedback to improve system availability.\nMapReduce has been used to perform morphological correlation analysis, which involves the analysis of geochemical data processing and the study of the correlation between multielements. Figure 8 shows the pattern correlation between elements. It can be seen from Figure 8 that the elements of Mn, Co, and Be are similar in the distribution of morphology. Therefore, from a qualitative point-of-view, the correlation is relatively high. Moreover, after testing, the proposed prototype system is running three times more quickly than the existing common computing platform, showing that the geological big data is applicable to the Hadoop platform. (More applications of using MapReduce can be found in Giachetta's 2015 paper.[51])\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 8. Correlation among three element morphologies\n\n\n\nApplications of geological big data technologies \nExploration of metallogenic law \nThe metallogenic law is the human regular knowledge of the temporal and spatial distribution of mineral resources, and its cognitive level, ability, and scope are all related to the size of data, the type of data, and the way of data processing. Therefore, to deduce the metallogenic law, it is necessary to fully understand the massive data surrounding spatial distribution, reserves and production in mineral origin, the geological structure of the mineral origin, and related geological survey data. As such, it's important to conduct the regular speculation and objectivity expression of these geological big data so that one can identify the essential reasons for the distribution of mineral origin. 
Using geological big data technologies could help to translate data into new understanding or knowledge and help to guide the future of geological prospecting work.\n\nSmart prospecting \nDeposit types vary, and their formation is related to certain geological backgrounds and geological effects, respectively. The geological backgrounds include tectonic unit and stratigraphic unit, deep upper mantle and lithosphere conditions, and paleogeography and palaeoclimate environment on the surface of the earth. Geological effects include tectonism, magmatism, sedimentation, metamorphism, and weathering. These geologic backgrounds and effects, in the wide range of space and in the long geologic history, are a dynamic change and repeated stack, and large deposits can be formed only in a variety of favorable conditions. Long-term scientific research and experience is required to make sense of the accumulation of formed mineral deposits and associated mineralization predictions. Professionals are guided by certain theories and methods to adopt quantitative or qualitative prospecting methods to predict with the existing knowledge and experience.\nHowever, in view of the difficulties of geological data sharing and the limitations of calculation tools and calculation methods, most of the known deposits in the past are independent of each other. In the future, we can use geological data to connect the exploration data of several adjacent deposits, conduct unified analysis and specialized processing, determine the \u201cdigital\u201d characteristics of the distribution of metallogenic materials, determine metallogenic potential, delineate the abnormal area and prospective area, and promote geological prospecting. 
Furthermore, geological data informatization and standardization could be improved.[52]\n\nService of people\u2019s livelihood geology \nSince entering the twenty-first century, geological work has become more closely related to economic development, and it plays an important role in every aspect of social and economic life. Agricultural geology, urban geology, environmental geology, tourism geology, disaster geology, and other fields of work have been strengthened, and the service area has also been expanded.[53] At the same time, the public demand for geological information is increasingly urgent.[54]\nIn order to meet the social demand for geological data, the China Geological Survey carried out the construction of a geological cloud, which built a cluster geologic data service system with the National Geological Information Center and the Provincial Geological Information Center as the backbone nodes, conducted the integration of data resources, and applied the GIS cloud technology, all in order to obtain large-scale computing ability and solve key problems such as the distributed storage, processing, query, interoperability, and virtualization of massive spatial data.[5][13] Recently, in China, the Shandong Provincial Bureau of Geology and Mineral Resources also carried out the construction of \u201cthe application system of geological business based on e-government cloud platform.\u201d It mainly relies on the public service cloud platform of the e-government in Shandong province and constructs the government external network service system and internet service system to achieve the unified management and information service of mineral resources. 
Using technical methods of spatial analysis, big data mining, and three-dimensional geological modelling, it develops a basic systems framework for geological mining services, featuring \u201ca (cloud) platform, a (data) center, and many application systems\u201d to improve the ability of the people\u2019s geological service, promote interaction with the public, realize socialization services, and promote the clustering and industrialization of the mineral resources information services.\n\nApplication of knowledge visualization service \nWith the continuous development of web technology, human beings have experienced the \u201cWeb 1.0\u201d era, which was characterized by document interconnection, and \u201cWeb 2.0\u201d, which was characterized by data interconnection, and are moving towards the new \u201cWeb 3.0\u201d era based on the interconnected knowledge of entities. Due to the continuous release of user-generated content and linked open data on the internet, people need to explore knowledge interconnection methods that both conform to the development of network information resources and meet users\u2019 requirements from a new perspective, according to the knowledge organization principles of the big data environment, revealing human cognition on a deeper level.[55]\nIn this context, Knowledge Graph (KG) was formally put forward by Google in May 2012, and its goal was to improve search results and describe the various entities and concepts that exist in the real world and the relationships between these entities and concepts. KG offers a way to keep the essence of the data and discard the dross, and it can be seen as a refinement of present semantic web technology. 
In recent years, the applications of KG have been increasing rapidly, and there is now a mature method used to draw a KG and conduct intelligent academic research based on KG.[34] However, the function of KG has not been fully implemented at present, especially for geological big data; the application aspect still needs to be further strengthened. Along this direction, the visualization service for geological data in web-based systems is attracting more and more attention.[56][57]\n\nConclusion \nBig data technologies make it possible to process massive amounts of unstructured and semistructured geological data. The geological cloud enables us to explore the application of demand-driven geological core data and to extract new information from unstructured data, while supporting decision-making in land resources management. Thus, the geological cloud could effectively organize and use geological big data and mine the data scientifically, with the purpose of producing higher value and achieving the corresponding service.\nWithin the architecture of the geological cloud, this article describes the application background of CEGIS and the demands from big data management. Furthermore, we elaborate on the application requirements and challenges faced in big data management technologies. Then, more analyses are provided from four aspects: data size, data type, data processing speed, and data processing accuracy. This article also outlines the research status and technology development opportunities of big data in CEGIS, from the perspectives of big data acquisition and preprocessing, big data storage and management, big data analysis and mining, high-performance big data cloud computing platforms, and big data technology applications. 
With the continuous development of big data technologies in addressing those challenges related to geological big data, such as the difficulties of describing and modeling geological big data with some complex characteristics, CEGIS will move towards a more mature and more intelligent direction in the future.\n\nConflicts of interest \nThe authors declare that there are no conflicts of interest regarding the publication of this article.\n\nAcknowledgments \nThis work was supported in part by the Key Laboratory of Geological Information Technology of Ministry of Land and Resources under Grant 2017320, the National Key Technologies R&D Program of China under Grant 2015BAK38B01, and the National Key R&D Program of China under Grant 2016YFC0600510.\n\nReferences \n\n\n\u2191 Vermeesch, P.; Garzanti, E. (2015). \"Making geological sense of \u2018Big Data\u2019 in sedimentary provenance analysis\". Chemical Geology 409: 20\u201327. doi:10.1016\/j.chemgeo.2015.05.004.   \n\n\u2191 2.0 2.1 Chen, J.; Xiang, J.; Hu, Q. et al. (2016). \"Quantitative Geoscience and Geological Big Data Development: A Review\". Acta Geologica Sinica 90 (4): 1490\u20131515. doi:10.1111\/1755-6724.12782.   \n\n\u2191 3.0 3.1 Zhu, Y.; Tan, Y.; Li, R. et al. (2016). \"Cyber-physical-social-thinking modeling and computing for geological information service system\". International Journal of Distributed Sensor Networks 12 (11). doi:10.1177\/1550147716666666.   \n\n\u2191 4.0 4.1 Kim, Y.-H.; Yarlagadda, P. (2013). \"Cloud Computing Model for Big Geological Data Processing\". Applied Mechanics and Materials 475\u2013476: 306\u2013311. doi:10.4028\/www.scientific.net\/AMM.475-476.306.   \n\n\u2191 5.0 5.1 5.2 5.3 5.4 Chen, J.; Li, J.; Cui, N.; Yu, P. (2015). \"The construction and application of geological cloud under the big data background\". Geological Bulletin of China 34 (7): 1260\u20131265. 
http:\/\/caod.oriprobe.com\/articles\/46629977\/The_construction_and_application_of_geological_cloud_under_the_big_dat.htm .   \n\n\u2191 Li, C. (2010). \"The technical infrastructure of geological survey information grid\". Proceedings from the 18th International Conference on Geoinformatics 2010: 1\u20136. doi:10.1109\/GEOINFORMATICS.2010.5567743.   \n\n\u2191 Wu, L.; Xue, L.; Li, C. et al. (2015). \"A Geospatial Information Grid Framework for Geological Survey\". PLoS One 10 (12): e0145312. doi:10.1371\/journal.pone.0145312.   \n\n\u2191 Evangelidis, K.; Ntouros, K.; Makridis, S. et al. (2014). \"Geospatial services in the Cloud\". Computers & Geosciences 63: 116\u2013122. doi:10.1016\/j.cageo.2013.10.007.   \n\n\u2191 Huang, M.; Liu, A.; Wang, T.; Huang, C. (2017). \"Green data gathering under delay differentiated services constraint for internet of things\". Wireless Communications and Mobile Computing. https:\/\/www.hindawi.com\/journals\/wcmc\/aip\/9715428\/ .   \n\n\u2191 \"Web of Science\". Clarivate Analytics. https:\/\/www.webofknowledge.com\/ .   \n\n\u2191 11.0 11.1 Yang, C.; Yu, M.; Hu, F. et al. (2017). \"Utilizing cloud computing to address big geospatial data challenges\". Computers, Environment and Urban Systems 61 (Part B): 120\u2013128. doi:10.1016\/j.compenvurbsys.2016.10.010.   \n\n\u2191 Wu, L.; Xue, L.; Li, C. et al. (2017). \"A Knowledge-Driven Geospatially Enabled Framework for Geological Big Data\". International Journal of Geo-Information 6 (6): 166. doi:10.3390\/ijgi6060166.   \n\n\u2191 13.0 13.1 Tan, Y. (2016). \"Architecture and Key Issues of Geological Big Data and Information Service Project\". Geomatics World 23 (1): 1\u20136. http:\/\/caod.oriprobe.com\/articles\/48928882\/Architecture_and_Key_Issues_of_Geological_Big_Data_and_Information_Ser.htm .   \n\n\u2191 14.0 14.1 Tan, Y. (2016). \"Architecture investigation of the construction of geological big data system\". 
Geological Survey of China 3 (3): 1\u20136. http:\/\/www.zgdzdcbjb.com\/EN\/abstract\/abstract160.shtml .   \n\n\u2191 He, W.; Wang, Y. (2014). \"Prototype system of geological cloud computing\". Progress in Geophysics 29 (6): 2886\u20132896. http:\/\/caod.oriprobe.com\/articles\/45636829\/Prototype_system_of_geological_cloud_computing.htm .   \n\n\u2191 16.0 16.1 Zhu, Y.; Tan, T.; Zhang, J. et al. (2015). \"A framework of hadoop based geology big data fusion and mining technologies\". Cehui Xuebao\/Acta Geodaetica et Cartographica Sinica 44 (S0): 152\u2013159. doi:10.11947\/j.AGCS.2015.F059.   \n\n\u2191 17.0 17.1 Wang, D.; Liu, X.; Liu, L. (2015). \"Characteristics of big geodata and its application to study of minerogenetic regularity and minerogenetic series\". Mineral Deposits 34 (6): 1143\u20131154. doi:10.16111\/j.0258-7106.2015.06.004.   \n\n\u2191 18.0 18.1 18.2 Pan, B.; Yang, R. (2017). \"Management and Utilization of Big Data for Geology\". Surveying and Mapping of Geology and Mineral Resources 33 (1): 1\u20133, 14. https:\/\/caod.oriprobe.com\/articles\/50925192\/Management_and_Utilization_of_Big_Data_for_Geology.htm .   \n\n\u2191 Yang, P.; Lu, L.J. (2014). \"The Research on Encoding Methodology of the Character of Geological Entity Based on Mass Geological Data\". Advanced Materials Research 962-965: 208\u2013212. doi:10.4028\/www.scientific.net\/AMR.962-965.208.   \n\n\u2191 Luo, X.; Zhang, D.; Yang, L.T. et al. (2016). \"A kernel machine-based secure data sensing and fusion scheme in wireless sensor networks for the cyber-physical systems\". Future Generation Computer Systems 61: 85\u201396. doi:10.1016\/j.future.2015.10.022.   \n\n\u2191 Kuo, C.-L.; Hong, J.-H. (2016). \"Interoperable cross-domain semantic and geospatial framework for automatic change detection\". Computers & Geosciences 86: 109\u2013119. doi:10.1016\/j.cageo.2015.10.011.   \n\n\u2191 Wang, Y.-J.; Sun, W.-D.; Zhou, S. et al. (2011). 
\"Key Technologies of Distributed Storage for Cloud Computing\". Journal of Software 23 (4): 962-986. doi:10.3724\/SP.J.1001.2012.04175.   \n\n\u2191 23.0 23.1 Armbrust, M.; Fox, A.; Griffith, R. et al. (10 February 2009). \"Above the Clouds: A Berkeley View of Cloud Computing\" (PDF). University of California at Berkeley. https:\/\/www2.eecs.berkeley.edu\/Pubs\/TechRpts\/2009\/EECS-2009-28.pdf .   \n\n\u2191 Xia, J.; Bai, Z.; Wang, B. et al. (2014). \"Design and Implementation of Comprehensive Management Platform for Geological Data Informatization\". Acta Scientiarum Naturalium Universitatis Pekinensis 50 (2): 295-300. http:\/\/xbna.pku.edu.cn\/EN\/abstract\/abstract2677.shtml .   \n\n\u2191 Hua, W.; Liu, J.; Liu, X. (2015). \"Data Management of Object Type Geological Features on Control Dictionary\". Earth Science - Journal of China University of Geosciences 40 (3): 425\u2013430. http:\/\/en.cnki.com.cn\/Article_en\/CJFDTotal-DQKX201503004.htm .   \n\n\u2191 26.0 26.1 Jia, B.; Wang, C.; Liu, C. et al. (2010). \"Design and implementation of object-oriented spatial database of coalfield geological hazards-based on object-oriented data model\". Proceedings from the 2010 International Conference on Computer Application and System Modeling 2010: V1282\u2013V1286. doi:10.1109\/ICCASM.2010.5619411.   \n\n\u2191 \"ArcGIS Enterprise\". Environmental Systems Research Institute, Inc. http:\/\/enterprise.arcgis.com\/en\/ .   \n\n\u2191 Zhou, X.; Li, X.; Chen, A. et al. (2013). \"Design and Implementation of the Service System of Spatial Data for Geological Data\". Journal of Geomatics 38 (4): 57\u201360. http:\/\/en.cnki.com.cn\/Article_en\/CJFDTOTAL-CHXG201304020.htm .   \n\n\u2191 Huang, H.; Cao, Z.; Feng, C. (2016). \"Opportunities and challenges of big data intelligence analysis\". CAAI Transactions on Intelligent Systems 11 (6): 719-727. doi:10.11992\/tis.201610025.   \n\n\u2191 Jin, S.; Lin, W.; Yin, H. et al. (2015). 
\"Community structure mining in big data social media networks with MapReduce\". Cluster Computing 18 (3): 999\u20131010. doi:10.1007\/s10586-015-0452-x.   \n\n\u2191 Yang, C.C.; Wei, C.-P.; Chien, L.-F. (2011). \"Managing and mining multilingual documents: Introduction to the special topic issue of information processing management\". Information Processing & Management 47 (5): 633\u2013634. doi:10.1016\/j.ipm.2010.02.002.   \n\n\u2191 Luo, X.; Deng, J.; Wang, W. et al. (2017). \"A Quantized Kernel Learning Algorithm Using a Minimum Kernel Risk-Sensitive Loss Criterion and Bilateral Gradient Technique\". Entropy 19 (7): 365. doi:10.3390\/e19070365.   \n\n\u2191 Tse, C.H.; Li, Y.; Lam, E.Y. (2015). \"Geological applications of machine learning on hyperspectral remote sensing data\". Proceedings Volume 9405: SPIE\/IS&T Electronic Imaging 2015 9405 (2015). doi:10.1117\/12.2178400.   \n\n\u2191 34.0 34.1 Zhu, Y.; Zhou, W.; Xu, Y. et al. (2017). \"Intelligent Learning for Knowledge Graph towards Geological Data\". Scientific Programming 2017 (2017). doi:10.1155\/2017\/5072427.   \n\n\u2191 Gasmi, A.; Gomez, C.; Zouai, H. et al. (2016). \"PCA and SVM as geo-computational methods for geological mapping in the southern of Tunisia, using ASTER remote sensing data set\". Arabian Journal of Geosciences 9: 753. doi:10.1007\/s12517-016-2791-1.   \n\n\u2191 Vo, H.X.; Durlofsky, L.J. (2015). \"Data assimilation and uncertainty assessment for complex geological models using a new PCA-based parameterization\". Computational Geosciences 19 (4): 747\u2013767. doi:10.1007\/s10596-015-9483-x.   \n\n\u2191 Luo, Z.S.; Wei, Y.T. (2012). \"Research on Rough Set Applied in the Geological Measure Data Prediction Model\". Advanced Materials Research 457-458: 792\u2013798. doi:10.4028\/www.scientific.net\/AMR.457-458.792.   \n\n\u2191 Farzamian, M.; Rouhani, A.K.; Yarmohammadi, A. et al. (2016). 
\"A weighted fuzzy aggregation GIS model in the integration of geophysical data with geochemical and geological data for Pb\u2013Zn exploration in Takab area, NW Iran\". Arabian Journal of Geosciences 9: 104. doi:10.1007\/s12517-015-2202-z.   \n\n\u2191 Xu, Y.; Luo, X.; Wang, W. et al. (2017). \"Efficient DV-HOP Localization for Wireless Cyber-Physical Social Sensing System: A Correntropy-Based Neural Network Learning Scheme\". Sensors 17 (1): E135. doi:10.3390\/s17010135. PMC PMC5298708. PMID 28085084. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5298708 .   \n\n\u2191 Luo, X.; Xu, Y.; Wang, W. et al. (2017). \"Towards enhancing stacked extreme learning machine with sparse autoencoder by correntropy\". Journal of the Franklin Institute. doi:10.1016\/j.jfranklin.2017.08.014.   \n\n\u2191 Luo, X.; Luo, H.; Chang, X. (2015). \"Online Optimization of Collaborative Web Service QoS Prediction Based on Approximate Dynamic Programming\". International Journal of Distributed Sensor Networks 11 (8): 452492. doi:10.1155\/2015\/452492.   \n\n\u2191 Luo, X.; Deng, J.; Liu, J. et al. (2017). \"A Quantized Kernel Least Mean Square Scheme with Entropy-Guided Learning for Intelligent Data Analysis\". China Communications 14 (7): 127\u2013136. http:\/\/www.cic-chinacommunications.cn\/CN\/Y2017\/V14\/I7\/127 .   \n\n\u2191 Passmore, J.; Laxton. J.; Sen, M. (2014). \"EarthServer for Geological applications \u2013 Opening up access to big data using OGC web services\". In Toll, D.G.; Zhu, H.; Osman, A. et al.. Information Technology in Geo-Engineering. Advances in Soil Mechanics and Geotechnical Engineering. 3. IOS Press. pp. 123\u2013129. doi:10.3233\/978-1-61499-417-6-123. ISBN 9781614994176.   \n\n\u2191 Li, C.; Song, M.; Lv, X. et al. (2010). \"The Spatial Data Sharing Mechanisms of Geological Survey Information Grid in P2P Mixed Network Systems Network Architecture Model\". 
Proceedings from the 9th International Conference on Grid and Cooperative Computing 2010: 258\u2013263. doi:10.1109\/GCC.2010.59.   \n\n\u2191 Cruz, S.A.B.; Monteiro, A.M.V.; Santos, R (2012). \"Automated geospatial Web Services composition based on geodata quality requirements\". Computers & Geosciences 47: 60\u201374. doi:10.1016\/j.cageo.2011.11.020.   \n\n\u2191 Xia, J.; Yang, C.; Liu, K. et al. (2015). \"Forming a global monitoring mechanism and a spatiotemporal performance model for geospatial services\". International Journal of Geographical Information Science 29 (3): 375-396. doi:10.1080\/13658816.2014.968783.   \n\n\u2191 Ibrahim, S.; Jin, H.; Lu, L. et al. (2009). \"Evaluating MapReduce on Virtual Machines: The Hadoop Case\". IEEE International Conference on Cloud Computing 2009: 519\u2013528. doi:10.1007\/978-3-642-10665-1_47.   \n\n\u2191 Iqbal, M.H.; Soomro, T.R. (2015). \"Big Data Analysis: Apache Storm Perspective\". International Journal of Computer Trends and Technology 19 (1): 9-14. doi:10.14445\/22312803\/IJCTT-V19P103.   \n\n\u2191 Reyes-Ortiz, J.L.; Oneto, L.; Anguita, D. (2015). \"Big Data Analytics in the Cloud: Spark on Hadoop vs MPI\/OpenMP on Beowulf\". Procedia Computer Science 53: 121\u2013130. doi:10.1016\/j.procs.2015.07.286.   \n\n\u2191 Meng, X.; Bradley, J.; Yavuz, B. et al. (2016). \"MLlib: Machine Learning in Apache Spark\". Journal of Machine Learning Research 17 (34): 1\u20137. http:\/\/jmlr.org\/papers\/v17\/15-237.html .   \n\n\u2191 Giachetta, R. (2015). \"A framework for processing large scale geospatial and remote sensing data in MapReduce environment\". Computers & Graphics 49: 37\u201346. doi:10.1016\/j.cag.2015.03.003.   \n\n\u2191 Huang, S.; Lui, X. (2016). \"Geological Data Informatization and Standardization Based on Geological Big Data\". Coal Geology of China 28 (7): 74\u201378. doi:10.3969\/j.issn.1674-1803.2016.07.17.   \n\n\u2191 Kouame, K.J.A.; Jiang, F.; Feng, T.; Zhu, S. (2017). 
\"The Strengthening of Geological Infrastructure, Research and Data Acquisition - Using Gis in Ivory Coast Gold Mines\". MATEC Web of Conferences 95: 18001. doi:10.1051\/matecconf\/20179518001.   \n\n\u2191 Karlsson, C.S.J.; Miliutenko, S.; Bj\u00f6rklund, A. et al. (2017). \"Life cycle assessment in road infrastructure planning using spatial geological data\". International Journal of Life Cycle Assessment 22 (8): 1302\u20131317. doi:10.1007\/s11367-016-1241-3.   \n\n\u2191 Stock, K.; Stojanovic, T.; Reitsma, F. et al. (2012). \"To ontologise or not to ontologise: An information model for a geospatial knowledge infrastructure\". Computers & Geosciences 45: 98-108. doi:10.1016\/j.cageo.2011.10.021.   \n\n\u2191 Hunter, J.; Brooking, C.; Reading, L.; Vink, S. (2016). \"A Web-based system enabling the integration, analysis, and 3D sub-surface visualization of groundwater monitoring data and geological models\". International Journal of Digital Earth 9: 197-214. doi:10.1080\/17538947.2014.1002866.   \n\n\u2191 M\u00fcller, R.D.; Qin, X.; Sandwell, D.T. et al. (2016). \"The GPlates Portal: Cloud-Based Interactive 3D Visualization of Global Geophysical and Geological Data in a Web Browser\". PLoS One 11 (3): e0150883. doi:10.1371\/journal.pone.0150883. PMC PMC4784813. PMID 26960151. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4784813 .   \n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. 
Grammar has been updated to make the content more readable.\n\n\n\n\n\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Big_data_management_for_cloud-enabled_geological_information_services\">https:\/\/www.limswiki.org\/index.php\/Journal:Big_data_management_for_cloud-enabled_geological_information_services<\/a>\n\t\t\t\t\tCategories: LIMSwiki journal articles (added in 2018); LIMSwiki journal articles (all); LIMSwiki journal articles on geoinformatics; LIMSwiki journal articles on software\n\n
This page was last modified on 20 February 2018, at 22:40.\nContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n\n","ec047b57c5e01fb4daaaffc7b376efce_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_Big_data_management_for_cloud-enabled_geological_information_services skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:Big data management for cloud-enabled geological information services<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p><a href=\"https:\/\/www.limswiki.org\/index.php\/Cloud_computing\" title=\"Cloud computing\" target=\"_blank\" class=\"wiki-link\" data-key=\"fcfe5882eaa018d920cedb88398b604f\">Cloud computing<\/a> as a powerful technology of performing massive-scale and complex computing plays an important role in implementing <a href=\"https:\/\/www.limswiki.org\/index.php\/Geoinformatics\" title=\"Geoinformatics\" target=\"_blank\" class=\"wiki-link\" 
data-key=\"2dc37de467d4af308f4b02d8e2ba12d1\">geological information services<\/a>. In the era of big data, data are being collected at an unprecedented scale. Therefore, to ensure successful data processing and analysis in cloud-enabled geological information services (CEGIS), we must address the challenging and time-demanding task of big data processing. This review starts by elaborating the system architecture and the requirements for big data management. This is followed by the analysis of the application requirements and technical challenges of big data management for CEGIS in China. This review also presents the application development opportunities and technical trends of big data management in CEGIS, including collection and preprocessing, storage and management, analysis and mining, parallel computing-based cloud platforms, and technology applications.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<p>In the era of big data, the data-driven modeling method enables us to exploit the potential of massive amounts of geological data easily.<sup id=\"rdp-ebb-cite_ref-VermeeschMaking15_1-0\" class=\"reference\"><a href=\"#cite_note-VermeeschMaking15-1\" rel=\"external_link\">[1]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-ChenQuant16_2-0\" class=\"reference\"><a href=\"#cite_note-ChenQuant16-2\" rel=\"external_link\">[2]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-ZhuCyber16_3-0\" class=\"reference\"><a href=\"#cite_note-ZhuCyber16-3\" rel=\"external_link\">[3]<\/a><\/sup> In particular, by mining the data scientifically, one can offer new services that bring higher value to customers. 
Furthermore, it is now possible to implement the transition from digital geology to intelligent geology by integrating multiple systems in geological research through the use of big data and other technologies.<sup id=\"rdp-ebb-cite_ref-KimCloud13_4-0\" class=\"reference\"><a href=\"#cite_note-KimCloud13-4\" rel=\"external_link\">[4]<\/a><\/sup>\n<\/p><p>The application of geological data management in the cloud makes it possible to fully utilize structured and unstructured data, including geology, minerals, geophysics, geochemistry, remote sensing, terrain, topography, vegetation, architecture, hydrology, disasters, and other digital geological data distributed in every place on the surface of the earth.<sup id=\"rdp-ebb-cite_ref-KimCloud13_4-1\" class=\"reference\"><a href=\"#cite_note-KimCloud13-4\" rel=\"external_link\">[4]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-ChenTheCons15_5-0\" class=\"reference\"><a href=\"#cite_note-ChenTheCons15-5\" rel=\"external_link\">[5]<\/a><\/sup> Moreover, the geological cloud will enable the integration of data collection, resource integration, data transmission, <a href=\"https:\/\/www.limswiki.org\/index.php\/Information\" title=\"Information\" target=\"_blank\" class=\"wiki-link\" data-key=\"6300a14d9c2776dcca0999b5ed940e7d\">information<\/a> extraction, and knowledge mining, which will pave the way for the transition from data to information, from information to knowledge, and from knowledge to wisdom. 
In addition, it supports <a href=\"https:\/\/www.limswiki.org\/index.php\/Data_analysis\" title=\"Data analysis\" target=\"_blank\" class=\"wiki-link\" data-key=\"545c95e40ca67c9e63cd0a16042a5bd1\">data analysis<\/a>, mining, organization, and management services for the scientific management of land resources, prospecting breakthrough strategic action and social services, while conducting multilevel, multiangle, and multiobjective demonstration applications on geological data for government decision-making, scientific research, and public services.<sup id=\"rdp-ebb-cite_ref-ChenTheCons15_5-1\" class=\"reference\"><a href=\"#cite_note-ChenTheCons15-5\" rel=\"external_link\">[5]<\/a><\/sup>\n<\/p><p>Big data technologies are bringing unprecedented opportunities and challenges to various application areas, especially to geological information processing.<sup id=\"rdp-ebb-cite_ref-ChenQuant16_2-1\" class=\"reference\"><a href=\"#cite_note-ChenQuant16-2\" rel=\"external_link\">[2]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-LiTheTech10_6-0\" class=\"reference\"><a href=\"#cite_note-LiTheTech10-6\" rel=\"external_link\">[6]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-WuAGeo15_7-0\" class=\"reference\"><a href=\"#cite_note-WuAGeo15-7\" rel=\"external_link\">[7]<\/a><\/sup> Under these circumstances, there are some advancements achieved in the development of this area.<sup id=\"rdp-ebb-cite_ref-EvangelidisGeo14_8-0\" class=\"reference\"><a href=\"#cite_note-EvangelidisGeo14-8\" rel=\"external_link\">[8]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-HuangGreen18_9-0\" class=\"reference\"><a href=\"#cite_note-HuangGreen18-9\" rel=\"external_link\">[9]<\/a><\/sup> Furthermore, from various disciplines of science and engineering, there has been a growing interest in this research field related to geological data generated in the geological information services (GIS). 
We analyzed the number of those documents indexed in the \u201cWeb of Science\u201d research database.<sup id=\"rdp-ebb-cite_ref-WoS_10-0\" class=\"reference\"><a href=\"#cite_note-WoS-10\" rel=\"external_link\">[10]<\/a><\/sup> In Figures 1 and 2, we can easily find that, in the past ten years, the number of those documents in which \u201cgeological data\u201d is in the title and in the topic is increasing, respectively. Hence, geological data analysis in GIS is an interesting and important research topic currently.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig1_ZhuSciProg2018_2018-2018.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"b892fcdcb50f2035df01603d795b57a6\"><img alt=\"Fig1 ZhuSciProg2018 2018-2018.png\" src=\"https:\/\/www.limswiki.org\/images\/c\/c0\/Fig1_ZhuSciProg2018_2018-2018.png\" width=\"339\" height=\"198\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 1.<\/b> The trend of the number of documents in which \u201cgeological data\u201d is in the title, from 2007 to 2016<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p><a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig2_ZhuSciProg2018_2018-2018.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"94b5ca58d0ec9873bac2b3abbc35e978\"><img alt=\"Fig2 ZhuSciProg2018 2018-2018.png\" src=\"https:\/\/www.limswiki.org\/images\/b\/bb\/Fig2_ZhuSciProg2018_2018-2018.png\" width=\"343\" height=\"199\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 2.<\/b> 
The trend of the number of documents in which \u201cgeological data\u201d is in the topic, from 2007 to 2016.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>Considering the development status of cloud-enabled geological information services (CEGIS) and the application requirements of big data management analysis, this article describes the significant impact and revolution on GIS brought by the advancement of big data technologies. Furthermore, this article outlines the future application development and technology development trend of big data management analysis in CEGIS.\n<\/p><p>The remainder of this article is organized as follows. In the next section we provide a review on CEGIS, with an emphasis on the descriptions for the system architecture and those requirements from big data management. Then, the challenges for big data management in CEGIS are presented. The key technologies and trends on big data management in CEGIS are analyzed afterwards, and finally we draw conclusions from the research.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Review_on_cloud-enabled_geological_information_services\">Review on cloud-enabled geological information services<\/span><\/h2>\n<p>The construction of a geological cloud differs from the current big data analysis based on the internet of things (IoT). Having a deep understanding of data characteristics is necessary to collect, process, analyze, and interpret data in different fields, because the nature and types of data vary in different fields and in different problems. Geology is a data-intensive science, and geological data are characterized with multisource heterogeneity, spatiotemporal variation, correlation, uncertainty, fuzziness, and nonlinearity. 
Therefore, the geological cloud has a certain degree of confidentiality and it is highly domain-specific; meanwhile, it is developed on the basis of a large amount of geological data accumulated over a long period of time.<sup id=\"rdp-ebb-cite_ref-ChenTheCons15_5-2\" class=\"reference\"><a href=\"#cite_note-ChenTheCons15-5\" rel=\"external_link\">[5]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-YangUtilizing17_11-0\" class=\"reference\"><a href=\"#cite_note-YangUtilizing17-11\" rel=\"external_link\">[11]<\/a><\/sup> There are many real-time data generated from geological disasters and the geological environment. The geological cloud includes core basic data, which can be divided into three parts: an existing structured database, some unstructured data, and public application data. Therefore, it is important to take good advantage of the existing traditional structured data, use the big data technologies to deal with the relevant unstructured data, and also consider the peripheral public data.\n<\/p><p>Geological big data are multidimensional, and they consist of both structured and unstructured data.<sup id=\"rdp-ebb-cite_ref-WuAKnow17_12-0\" class=\"reference\"><a href=\"#cite_note-WuAKnow17-12\" rel=\"external_link\">[12]<\/a><\/sup> The technical methods of big data analysis differ greatly from those of professional databases. 
Long-term geological survey and study have yielded years of geological information, forming a rich and professional database, which is an important fundamental assurance for land and resources science management, geological survey, and geological information public service.<sup id=\"rdp-ebb-cite_ref-TanArchi16_13-0\" class=\"reference\"><a href=\"#cite_note-TanArchi16-13\" rel=\"external_link\">[13]<\/a><\/sup> This \u201cprofessional cloud\u201d objectively requires technology research and development, such as the construction of a professional local area network, a data sharing platform, and geological big data <a href=\"https:\/\/www.limswiki.org\/index.php\/Data_visualization\" title=\"Data visualization\" target=\"_blank\" class=\"wiki-link\" data-key=\"4a3b86cba74bc7bb7471aa3fc2fcccc3\">visualization<\/a> services. Hence, the construction of a geological cloud service is closely related to land resource management, deployment decisions, and the application demand of public service. The key technologies of research and development include the following: unstructured data extraction and mining analysis, structured and unstructured data mixed storage and management, big data sharing platform, data transmission, and visualization.<sup id=\"rdp-ebb-cite_ref-YangUtilizing17_11-1\" class=\"reference\"><a href=\"#cite_note-YangUtilizing17-11\" rel=\"external_link\">[11]<\/a><\/sup>\n<\/p><p>Generally, the construction of a geological cloud is a long-term systematic project. This means that it is required to follow the basic principles of \u201cstanding on the reality, focusing on the future\u201d and \u201cfocusing on the long-term and overall situation, embarking on the current and local situation,\u201d in order to achieve the analysis and application of geological cloud public data and core data gradually in accordance with the technical route of big data analysis; thus the construction of a geological cloud will be implemented eventually. 
For the earth, land and resource management should cover many respects, including human behavior, climate change, development and utilization of various resources, natural disasters, environmental pollution, and the ecosystem cycle. Then the introduction of big data technologies can integrate this type of resource information to provide the ability of uniformly dealing with the problems related to the entire earth information resources, which has a significant effect on the strategic planning of land and resource management.<sup id=\"rdp-ebb-cite_ref-ZhuCyber16_3-1\" class=\"reference\"><a href=\"#cite_note-ZhuCyber16-3\" rel=\"external_link\">[3]<\/a><\/sup>\n<\/p><p>The geological cloud is an important component of the scientific process for geological data research. The ultimate goal for developing a geological cloud is to better describe and understand the complex earth system and geological framework, provide the scientific basis for the description of the land surface and the biodiversity characteristics of the earth, and improve the ability to deal with complex social problems.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"System_architecture\">System architecture<\/span><\/h3>\n<p>Because the business service functions of each country differ, the system architecture of the geological cloud also will vary. 
In Figure 3, we present a system architecture, using China as an example.<sup id=\"rdp-ebb-cite_ref-TanArchiInvest16_14-0\" class=\"reference\"><a href=\"#cite_note-TanArchiInvest16-14\" rel=\"external_link\">[14]<\/a><\/sup>\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig3_ZhuSciProg2018_2018-2018.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"c6a66af2189ea27fe8da0730e333abb2\"><img alt=\"Fig3 ZhuSciProg2018 2018-2018.png\" src=\"https:\/\/www.limswiki.org\/images\/c\/c0\/Fig3_ZhuSciProg2018_2018-2018.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 3.<\/b> The system architecture of a geological cloud<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>The geological cloud combines the geological survey intranet and the geological survey extranet. 
It enables the sharing and management of computing resources, storage resources, network resources, software resources, and geological data resources.<sup id=\"rdp-ebb-cite_ref-HeProto14_15-0\" class=\"reference\"><a href=\"#cite_note-HeProto14-15\" rel=\"external_link\">[15]<\/a><\/sup>\n<\/p><p>The geological cloud can be summarized as having the following characteristics<sup id=\"rdp-ebb-cite_ref-TanArchiInvest16_14-1\" class=\"reference\"><a href=\"#cite_note-TanArchiInvest16-14\" rel=\"external_link\">[14]<\/a><\/sup>:\n<\/p><p>(i) \u201cOne platform: The geological cloud management platform\u201d: It uniformly manages computing resources, storage resources, network resources, software resources, and geological data resources.\n<\/p><p>(ii) \u201cTwo networks: The geological survey intranet and the geological survey extranet\u201d: Here, the intranet is constructed by creating a network that is physically isolated from the internet. The intranet is developed on the basis of the existing geological survey network, and each node is linked through a dedicated line or bare fiber. All of the internal business management systems, software systems, and data are deployed on the intranet, providing services to 28 local units and the users of more than 350 geological survey projects. Facilitated by the public geological survey network, the geological survey business management system, geological data information service system, and public geological data can be deployed on the extranet accessed by the general public. The communication between the intranet and the extranet, including data exchange and audit, can be carried out via a single-directional light gate.\n<\/p><p>(iii) \u201cOne main node and three domain-specific nodes\u201d: One main node is constructed at the China Geological Survey Development Research Center. 
In addition, three domain-specific nodes \u2014 namely the marine node, geological environment node, and aviation geophysical exploration and remote sensing node \u2014 are constructed. Each node is configured with the corresponding servers, storage equipment, network equipment, management platform, large-scale specialized data processing software system, and various customized applications. Each node would store huge amounts of geological data and conform to current data security standards. The master node and the domain-specific nodes are linked via optical fibers. The master node will consist of 200 computing nodes with three\u2009petabytes of storage capacity and will be equipped with geological data processing software systems. The master node will be hosted in a medium-sized supercomputing center, and it will provide support for the three-dimensional seismic exploration data processing and other large-scale computing. The three domain-specific nodes are to maintain their scale in the near future to facilitate reasonable scheduling and efficient utilization of information resources and data resources.\n<\/p><p>A system for geological survey business management and auxiliary decision-making is deployed on the extranet. The system provides a real-time tracking and management function for geological survey projects and various resources.\n<\/p><p>Main users of the geological cloud include institutional users, geological survey project users, and users from the general public. The institutional users can store the current geological database and newly collected data in the geological cloud through the geological survey business network and can obtain the geological data of other institutions from the cloud as needed. 
The geological survey project users can access the cloud geological background data through 4G or satellite lines and can collect data through the data collection system.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Requirement_from_big_data_management\">Requirement from big data management<\/span><\/h3>\n<p>The construction of a geological cloud must meet customer demand. Big data technologies are then used as the means to implement the geological cloud.\n<\/p><p>The types and quantity of geological data have been continuously growing over the years. Geological data include all kinds of electronic documents, structured, semistructured, and unstructured data, such as various databases (map database, spatial database, and attribute database), pictures, tables, video, and audio. Generally speaking, without requirements to guide the analysis, important data may remain buried in the massive dataset. Hence, the first step is to understand the user requirements and then gain the capability of large-scale data processing. This is followed by data mining, algorithms, and analysis, which will ultimately generate value. Big data technologies in the field of geography must meet the different needs of people at different levels, including public demand for geologic data services and the professional data demands of geological research institutions, as well as those of related enterprises and government departments.<sup id=\"rdp-ebb-cite_ref-ZhuAFrame15_16-0\" class=\"reference\"><a href=\"#cite_note-ZhuAFrame15-16\" rel=\"external_link\">[16]<\/a><\/sup>\n<\/p><p>On the basis of big data analysis technologies, a complete data link is formed connecting data, information, knowledge, and service, through the use of an advanced cloud computing system, IoT, and big data processing flow. 
It is shown in Figure 4.<sup id=\"rdp-ebb-cite_ref-ChenTheCons15_5-3\" class=\"reference\"><a href=\"#cite_note-ChenTheCons15-5\" rel=\"external_link\">[5]<\/a><\/sup>\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig4_ZhuSciProg2018_2018-2018.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"9371a0cad10f5a5f34b8b7bda918ca29\"><img alt=\"Fig4 ZhuSciProg2018 2018-2018.png\" src=\"https:\/\/www.limswiki.org\/images\/3\/33\/Fig4_ZhuSciProg2018_2018-2018.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 4.<\/b> Schematic diagram of big data analysis<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h2><span class=\"mw-headline\" id=\"Challenges_for_big_data_management_in_cloud-enabled_geological_information_services\">Challenges for big data management in cloud-enabled geological information services<\/span><\/h2>\n<p>Geological big data are generated regarding various layers of the earth, the history of the conformation and evolution of the earth, and the material composition of the earth and its changes. It also involves the exploration and utilization of mineral resources. In the current geological work, the collection, mining, processing, analysis, and utilization of various complex type data are closely related to those general big data. The \u201c4V\u201d characteristics of big data \u2014 namely volume, velocity, variety, and veracity \u2014 also apply to geological big data.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Volume\">Volume<\/span><\/h3>\n<p>Currently, there is no consensus on the collective size of geological data. 
Geological big data include geology, minerals, remote sensing, geophysical exploration, geochemical exploration, surveying, and mapping data, which are interconnected and integrated. In terms of the number of mines, there are at least 70,000 in China, and some official documents and popular science books indicate that there are more than 200,000 deposits and minerals that have been found. This collection of information is huge and cannot be processed using conventional tools. For example, an Excel spreadsheet cannot contain all the information of 70,000 mining areas. As such, it is difficult to classify and rank the 200,000 mines, so it is necessary to rely on the concepts and technologies of big data.<sup id=\"rdp-ebb-cite_ref-WangChar16_17-0\" class=\"reference\"><a href=\"#cite_note-WangChar16-17\" rel=\"external_link\">[17]<\/a><\/sup>\n<\/p><p>Especially in recent years, images, video, and other types of data have emerged on a large scale. With the application of 3D scanning and other devices, the data volume has been increasing exponentially. The ability to describe the data is more and more powerful, and the data are gradually approximated to the real world. In addition, the large amount of data is also reflected in the aspect that the methods and ideas used by people to deal with data have undergone a fundamental change. In the early days, people used the sampling method to process and analyze data in order to approximate the objective with a small number of subsample data. With the development of technologies, the number of samples gradually approaches the overall data. 
Using all the data can lead to a higher accuracy, which can explain things in more detail.<sup id=\"rdp-ebb-cite_ref-PanMan17_18-0\" class=\"reference\"><a href=\"#cite_note-PanMan17-18\" rel=\"external_link\">[18]<\/a><\/sup>\n<\/p><p>Recently, the China geological survey system has built many databases, including<sup id=\"rdp-ebb-cite_ref-WangChar16_17-1\" class=\"reference\"><a href=\"#cite_note-WangChar16-17\" rel=\"external_link\">[17]<\/a><\/sup>:\n<\/p>\n<ul><li> a regional geological database (covering the 1\u2009:\u20092500000, 1\u2009:\u20091000000, 1\u2009:\u2009500000, 1\u2009:\u2009250000, and 1\u2009:\u2009200000 regional geological map; the national 1\u2009:\u2009200000 natural sand; the isotope geological dating; and the lithostratigraphic unit database);<\/li>\n<li> a basic geological database (covering the national rock property database and national geological working degree database);<\/li>\n<li> a mineral resources database (covering the national mineral resources, the national mineral resources utilization survey mining resources reserves verification results, the national survey of large and medium-sized mines, the prospect of mineral resources, the survey of the resources potential of major solid mineral resources in China, and the geological and mineral resources database);<\/li>\n<li> an oil and gas energy database (covering the oil and gas basins in China, the geological survey results of the national oil and gas resources, the national petroleum and geophysical exploration, national shale gas, national coal bed methane, national natural gas hydrate, and other databases);<\/li>\n<li> a geophysical database (covering 1\u2009:\u20091 million, 1\u2009:\u2009500000, 1\u2009:\u2009250000, 1\u2009:\u2009200000, and 1\u2009:\u200950000 gravity, national regional gravity, national aeromagnetism, national ground magnetism, national electrical survey, seismic survey, national aviation radioactivity, and national logging database);<\/li>\n<li> a 
geochemical database (covering the databases of national 1\u2009:\u2009250000 and 1\u2009:\u200920 geochemical exploration, national multiobjective geochemical and national land quality evaluation results);<\/li>\n<li> a remote sensing survey database (covering national aeronautical remote sensing image, China resources satellite data, space remote sensing image, national mine environmental remote sensing monitoring, national high score satellite, and other databases);<\/li>\n<li> a drilling database (covering the national geological borehole information, the national important geological borehole, the Chinese mainland scientific drilling core scanning image library, and so on);<\/li>\n<li> a hydraulic cycle hazards database;<\/li>\n<li> a data literature database;<\/li>\n<li> a special subject database (covering the national mineral resources potential evaluation database, the important mineral \u201cthree-rate\u201d investigation and evaluation database); and<\/li>\n<li> a work management database (covering the national exploration right, mining right, mining right verification, geological information metadata database, and many others).<\/li><\/ul>\n<p>Those databases are still being expanded and refined, and their practical value has not yet been fully realized. However, it is virtually impossible for the vast majority of researchers to hold all of the above data; at most, they use their own accumulated data. Even so, because their accumulated data far exceed, in both quantity and type, what was available 10 or 20 years ago, they are in fact working in an era of \u201crelatively big data.\u201d From 1999 to 2004, for example, although 202 national academic experts participated in the \u201cChinese mineralization system and regional metalorganic evaluation\u201d project, it contained master data on only 4500 properties (all kinds of minerals). 
From 2006 to 2013, the study of \u201cnational important mineral and regional mineralization laws\u201d was conducted; in that study, the number of mining areas covered by the mineral resources research institute alone was 30,600. Therefore, the growth in information and in the amount of data over the last 10 years is unprecedented.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Variety\">Variety<\/span><\/h3>\n<p>From a formal point-of-view, geological big data have many characteristics, including multidimensionality, multiscale, and multitenses. They contain structured, semistructured, and unstructured data and usually are stored in forms of text, graphics, images, databases (including image databases, spatial databases, and attribute databases), tables, videos, and audio, often in a fragmented state. For example, a large number of field outcrop descriptions, borehole core descriptions, geological surveys, exploration reports, geological maps, drawings, and photos were stored and managed on paper for a long time. Even the numerous relational databases and spatial databases were primarily used to store and manage structured data that are tabulated and vectorized, while the text descriptions, records, and summaries were directly stored. Little standardized processing or structural transformation was performed. Furthermore, there is no tool available to effectively store and manage structured, semistructured, and unstructured data in an integrated way.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Velocity\">Velocity<\/span><\/h3>\n<p>Geological data are growing very fast, especially in remote sensing geology, aviation geophysical exploration, regional geochemical exploration, and other fields, due to the introduction of new technologies and new methods. Meanwhile, high-speed processing is also a characteristic of big data. 
In addition to the need to analyze data in real time, people also need to describe the results of data mining and processing through the use of several data processing techniques, such as image and video, which requires effective and efficient handling skills. For example, the detection of deep earth information not only needs to obtain parameters of the seismic wave reflection and refraction but also needs to conduct quick processing, so as to predict in a timely manner whether earthquakes will occur and forecast the time, location, and intensity. In this way, we can avoid the disaster effectively. When applying a variety of data to a particular mountain, one should learn which ones have spatial limitations and which are not related to spatiality, so that one can deduce the metallogenic law and better guide prospecting.<sup id=\"rdp-ebb-cite_ref-PanMan17_18-1\" class=\"reference\"><a href=\"#cite_note-PanMan17-18\" rel=\"external_link\">[18]<\/a><\/sup>\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Veracity\">Veracity<\/span><\/h3>\n<p>When it comes to the value of big data, most people regard it as having low value density, meaning that truly useful information makes up very little of the vast amount of data. Taking video as an example, the useful data may be only a second or two in the continuous monitoring process. At the same time, big data is high-value and does not require heavy investment; just collecting information from the internet can bring business value. Therefore, big data has the characteristics of low value density and high business value. The same is true for geological big data. So far, a great deal of geophysical prospecting information has been gathered, but only a little of it has been confirmed, and few mines have been discovered. 
But once a breakthrough was made, its socioeconomic value was enormous, such as the lithium polymetallic deposit in Tibet and the newly discovered Jima copper polymetallic deposit in the outskirts of Sichuan.<sup id=\"rdp-ebb-cite_ref-PanMan17_18-2\" class=\"reference\"><a href=\"#cite_note-PanMan17-18\" rel=\"external_link\">[18]<\/a><\/sup>\n<\/p><p>In addition, the spatial attribute and temporal attribute of geological data also bring a big challenge to data accuracy. Any geological data have spatial attributes, and their values are reflected in the spatial law of distribution of mineral resources. For this reason, in the process of establishing the metallogenic series, exploring the metallogenic law, and constructing the mathematical model, the spatial attribute of the metallogenic model should be considered. Obviously, every metallogenic series has its own spatial attributes. Geological data also has the time attribute, which is very different from physical, chemical, and other natural sciences. One of the fundamental pillars of geology is the geological time scale. The rocks, strata, and deposits of different geological periods have different distribution characteristics and regularity, so those data have their own time attribute.\n<\/p><p>It is obvious that those characteristics of geological big data mentioned above impose very challenging obstacles to the data management in CEGIS. 
The challenges related to geological big data management can be summarized as follows:\n<\/p><p>(i) It is quite difficult to describe and model geological big data since there are few effective description mechanisms for characteristics and object modeling approaches under the cloud computing environment.\n<\/p><p>(ii) There remain many technical issues that must be addressed to fully manage, mine, analyze, integrate, and share those geological big data, in consideration of those complex characteristics, including multi-source heterogeneous data, highly spatiotemporal variation, high-volume and high-correlation data, and many others.\n<\/p><p>(iii) Many issues appear in achieving decision support, such as data incompleteness, data uncertainty, and high-dimensionality of data.\n<\/p><p>The broad range of challenges described here make good topics for research within the field of big data management in CEGIS. They are analyzed in the next section.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Key_technologies_and_trends_on_big_data_management_in_cloud-enabled_geological_information_services_.28CEGIS.29\">Key technologies and trends on big data management in cloud-enabled geological information services (CEGIS)<\/span><\/h2>\n<p>With the rapid advancement of big data technologies, some key technologies are accordingly developed for big data management in CEGIS. Specifically, a schematic diagram of those key technologies is shown in Figure 5. 
In this section we present an analysis on those key technologies, along with discussion of key related trends.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig5_ZhuSciProg2018_2018-2018.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"1982bdb5201d71401ef7945f80e0fbe7\"><img alt=\"Fig5 ZhuSciProg2018 2018-2018.png\" src=\"https:\/\/www.limswiki.org\/images\/9\/9b\/Fig5_ZhuSciProg2018_2018-2018.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 5.<\/b> Schematic diagram of key technologies for big data management in CEGIS<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"Geological_big_data_collection_and_preprocessing\">Geological big data collection and preprocessing<\/span><\/h3>\n<p>Geological big data collection and preprocessing aim to categorize those geological big data obtained from geological data, geological information, and geological literature.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Geological_data_collection_access\">Geological data collection access<\/span><\/h4>\n<p>In addition to the traditional collection ways, it is also required to carry out large-scale network information access and provide real-time, high concurrency, and fast web content acquisition, combining with the application characteristics in the cloud environment. 
Currently, because geological information generated from the network is growing very fast, the big data analysis system should obtain relevant data quickly.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Quality_and_usability_characteristics_of_geological_data\">Quality and usability characteristics of geological data<\/span><\/h4>\n<p>Valuable information needs to be distinguished and identified through intelligent discovery and management technologies. Because the information value density differs from one data source to another, filtering out useless or low-value data sources can effectively reduce data storage and processing costs. It can also further improve the efficiency and accuracy of analysis.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Geological_data_entity_recognition_model\">Geological data entity recognition model<\/span><\/h4>\n<p>According to the subject domain of geology, the distributed data are processed, integrated, and then extracted to form a data warehouse. When extracting data in the field of geology, an entity modeling method is needed to abstract entities from the vast amount of data, so as to find the relationships between those entities. This approach ensures that the data used in the warehouse are consistent and relevant in accordance with the data model.<sup id=\"rdp-ebb-cite_ref-YangTheRes14_19-0\" class=\"reference\"><a href=\"#cite_note-YangTheRes14-19\" rel=\"external_link\">[19]<\/a><\/sup> These recognized data are directly input into the system and stored as metadata, which could be used for data management and analysis.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Aggregation_of_geological_big_data\">Aggregation of geological big data<\/span><\/h4>\n<p>Generally, different data sources and even the same data source may generate data with different formats. 
As mentioned above, because these structured, semistructured, and unstructured multimodal geological big data are integrated together, the data heterogeneity is obvious in big data analysis. Data aggregation, as the key technology in achieving data extraction and transformation<sup id=\"rdp-ebb-cite_ref-LuoAKern16_20-0\" class=\"reference\"><a href=\"#cite_note-LuoAKern16-20\" rel=\"external_link\">[20]<\/a><\/sup>, enables data sharing and data fusion between heterogeneous data sources. Through the use of heterogeneous information aggregation technologies, unified data retrieval and data presentation could be achieved. Once those distributed heterogeneous data sources are aggregated, their data are extracted and converted so that a subject domain database and data warehouse can be constructed automatically.<sup id=\"rdp-ebb-cite_ref-KuoInter16_21-0\" class=\"reference\"><a href=\"#cite_note-KuoInter16-21\" rel=\"external_link\">[21]<\/a><\/sup>\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Management_of_geological_big_data_evolution_tracking_records\">Management of geological big data evolution tracking records<\/span><\/h4>\n<p>In order to effectively utilize geological big data, the evolution of big data needs to be tracked during the whole life cycle of GIS, with the purpose of achieving traceable big data management.\n<\/p><p>Here, we provide an example of aggregating and collecting geological big data in CEGIS; Figure 6 illustrates this process. While developing CEGIS, all kinds of geological data should be processed. 
Through the use of the geological cloud, big data are collected, and then they are aggregated to achieve some key functions in the geological information service platform, including catalog sharing, intelligent searching, data products release, and collaborative service.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig6_ZhuSciProg2018_2018-2018.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"560fc3ba6128ee97568e897646f10894\"><img alt=\"Fig6 ZhuSciProg2018 2018-2018.png\" src=\"https:\/\/www.limswiki.org\/images\/2\/21\/Fig6_ZhuSciProg2018_2018-2018.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 6.<\/b> An example of aggregating and collecting geological big data in CEGIS<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"Geological_big_data_storage_and_management\">Geological big data storage and management<\/span><\/h3>\n<p>From the data collection perspective, geological data can be divided into field survey data, drilling and engineering exploration data, remote detection data, analytical test data, and comprehensive study data. From the angle of comprehensive application fields, they can be also divided into regional geological survey data, energy and mineral resources evaluation and exploration survey results data, geological disaster monitoring and early warning data, geological environment survey and evaluation results data, and marine geological survey and evaluation data. From the data formality point of view, they can be divided into picture data, text report data, tabular data, and image data. 
These data are collected by different units.\n<\/p><p>Facing these complex geological big data mentioned above, the traditional relational database will find it difficult to handle them, while the distributed storage system can be used more effectively to store such huge amounts of data and manage them. The data system places the massive data in many machines, which avoids such limitations of storage capacity, though it also brings many problems that have not occurred before in stand-alone systems. Hence, some distributed data storage solutions have accordingly emerged, including Hadoop, Spark, and other nonrelational database systems (like HBase, MongoDB, and many others).<sup id=\"rdp-ebb-cite_ref-WangKey11_22-0\" class=\"reference\"><a href=\"#cite_note-WangKey11-22\" rel=\"external_link\">[22]<\/a><\/sup> These different solutions satisfy the specific requirements from different applications. When applying to the analysis of big data, different solutions can be employed according to the specific needs of different intelligence analysis. Furthermore, different solutions can be combined to meet specific needs. 
In fact, there have been some attempts to develop combination strategies for distributed storage models, which vary in their big data management performance requirements and in the complexity of the collected big data that the distributed storage system supports.<sup id=\"rdp-ebb-cite_ref-ArmbrustAbove09_23-0\" class=\"reference\"><a href=\"#cite_note-ArmbrustAbove09-23\" rel=\"external_link\">[23]<\/a><\/sup> Hence, there is still room for improvement and optimization of geological big data storage, in particular by designing a hybrid distributed storage model that exploits the cloud's flexibly scalable deployment to meet users\u2019 requirements for geological big data resource management with satisfactory data durability and high availability.<sup id=\"rdp-ebb-cite_ref-ArmbrustAbove09_23-1\" class=\"reference\"><a href=\"#cite_note-ArmbrustAbove09-23\" rel=\"external_link\">[23]<\/a><\/sup>\n<\/p><p>Here, the hot research topics include the following:\n<\/p><p>(i) For geological applications, load optimization storage should be implemented to achieve the coupling of data storage and application, as well as the coupling of the distributed file system and the new storage system.\n<\/p><p>(ii) Based on the application characteristics of distributed databases, more studies could be conducted on the application of new databases such as NoSQL and NewSQL in geological survey work.\n<\/p><p>With the development of big data technologies, more and more mature distributed data storage solutions will emerge and will be applied to big data analysis.<sup id=\"rdp-ebb-cite_ref-XiaDesign14_24-0\" class=\"reference\"><a href=\"#cite_note-XiaDesign14-24\" rel=\"external_link\">[24]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-HuaData15_25-0\" class=\"reference\"><a href=\"#cite_note-HuaData15-25\" rel=\"external_link\">[25]<\/a><\/sup>\n<\/p><p>Specifically, in the management of geological big data, the implementation of data queries, such as spatial queries, has been a 
long-term focus. Generally, considering those advantages with unified modeling language (UML) and computer-aided software engineering (CASE) methodology, the spatial database could be accordingly designed and implemented to characterize and realize the object-oriented spatial vector big data firstly.<sup id=\"rdp-ebb-cite_ref-JiaDesign10_26-0\" class=\"reference\"><a href=\"#cite_note-JiaDesign10-26\" rel=\"external_link\">[26]<\/a><\/sup> And then, in the developed spatial database, the function of self-generating codes would be achieved to realize two-way spatial query between graphic-objects and property data.<sup id=\"rdp-ebb-cite_ref-JiaDesign10_26-1\" class=\"reference\"><a href=\"#cite_note-JiaDesign10-26\" rel=\"external_link\">[26]<\/a><\/sup> Moreover, in consideration of the complex characteristics of geological big data, the spatial query is achieved finally through the use of Flex technology in the ArcGIS Server software platform.<sup id=\"rdp-ebb-cite_ref-AGIS_27-0\" class=\"reference\"><a href=\"#cite_note-AGIS-27\" rel=\"external_link\">[27]<\/a><\/sup> Practically speaking, with this technology, the spatial query could be implemented through two functions, including \u201cQuery\u201d and \u201cFind\u201d query methods.<sup id=\"rdp-ebb-cite_ref-ZhouDesign13_28-0\" class=\"reference\"><a href=\"#cite_note-ZhouDesign13-28\" rel=\"external_link\">[28]<\/a><\/sup>\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Geological_big_data_analysis_and_mining\">Geological big data analysis and mining<\/span><\/h3>\n<p>In terms of geological data analysis and mining, it needs to combine geological data, geological information, and geological literature, through the analysis of geological application demand of real-time mining, to explore geological big data environment analysis and mining algorithms, in an effort to fully achieve the goal of intelligent mining for geological big data.\n<\/p><p>Figure 7 shows a schematic diagram of discovering geological knowledge 
through analyzing and mining geological big data. It can be readily seen that geological big data analysis and mining play an important role in achieving the final goal.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig7_ZhuSciProg2018_2018-2018.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"e53ec22c4f68c3341f49ee60a444d1c8\"><img alt=\"Fig7 ZhuSciProg2018 2018-2018.png\" src=\"https:\/\/www.limswiki.org\/images\/d\/d7\/Fig7_ZhuSciProg2018_2018-2018.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 7.<\/b> Schematic diagram of discovering geological knowledge through analyzing and mining geological big data<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>Relevant research work mainly involves the following aspects.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Geological_big_data_analysis\">Geological big data analysis<\/span><\/h4>\n<p>For these specialized applications, geological big data technologies apply big data concepts to analyze metallogenic rules by making full use of various ore-related data, to recognize deposit metallogenic series, to summarize the metallogenic regularities and express them in an appropriate way (such as voice, image, and many others), and to establish a scientific mathematical model. 
The model then uses new exploration data to predict future data and to guide geological prospecting.\n<\/p><p>In addition, it is necessary to pay special attention to the analysis of new geological big data information collected from social media and networks.<sup id=\"rdp-ebb-cite_ref-HuangOpp16_29-0\" class=\"reference\"><a href=\"#cite_note-HuangOpp16-29\" rel=\"external_link\">[29]<\/a><\/sup> These include the geological text information flow data from microblog web sites, the geological multimedia data from media sharing web sites, the geology-related user interaction data on social networking web sites, and many others.<sup id=\"rdp-ebb-cite_ref-JinComm15_30-0\" class=\"reference\"><a href=\"#cite_note-JinComm15-30\" rel=\"external_link\">[30]<\/a><\/sup> These multisource data complement traditional big data. Specifically, such data should be addressed with the help of multilingual information processing, multilingual machine translation, and social network cross-language retrieval.<sup id=\"rdp-ebb-cite_ref-YangMana11_31-0\" class=\"reference\"><a href=\"#cite_note-YangMana11-31\" rel=\"external_link\">[31]<\/a><\/sup> Big data analysis of such data is key to deeper and broader use of geological data. 
With the maturity of big data analysis technologies, it becomes possible to analyze and extract valuable information from these data<sup id=\"rdp-ebb-cite_ref-LuoAQuant17_32-0\" class=\"reference\"><a href=\"#cite_note-LuoAQuant17-32\" rel=\"external_link\">[32]<\/a><\/sup> and to provide effective solutions for geological big data applications.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Geological_big_data_mining\">Geological big data mining<\/span><\/h4>\n<p>Data mining involves the extraction of unknown and useful knowledge and information from the massive multilevel spatiotemporal data and attribute data, using statistics, pattern recognition, artificial intelligence, set theory, fuzzy mathematics, cloud computing, machine learning, visualization, and relevant techniques and methods. Data mining could reveal the relationship and evolution trend behind the geological big data, achieve the automatic or semiautomatic acquisition of the new knowledge, and provide the decision basis for resource prediction, prospecting, environmental assessment, and disaster prevention and mitigation.<sup id=\"rdp-ebb-cite_ref-TseGeo15_33-0\" class=\"reference\"><a href=\"#cite_note-TseGeo15-33\" rel=\"external_link\">[33]<\/a><\/sup> The knowledge is obtained directly from known geological data to provide relevant decision support.<sup id=\"rdp-ebb-cite_ref-ZhuInt17_34-0\" class=\"reference\"><a href=\"#cite_note-ZhuInt17-34\" rel=\"external_link\">[34]<\/a><\/sup> In consideration of the amount of data, it may deal with terabytes or even petabytes of data, as well as multidimensional, noisy, and dynamic data. 
Because data mining algorithms will directly influence the outcome of the discovered knowledge, selecting the most appropriate algorithms and parallel computing strategy is the key to data mining.\n<\/p><p>Effective data mining also could reduce manual intervention during information processing and make use of methods and tools of big data intelligent analysis.<sup id=\"rdp-ebb-cite_ref-VoData15_35-0\" class=\"reference\"><a href=\"#cite_note-VoData15-35\" rel=\"external_link\">[35]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-GasmiPCA16_36-0\" class=\"reference\"><a href=\"#cite_note-GasmiPCA16-36\" rel=\"external_link\">[36]<\/a><\/sup> Recently, there has been a growing interest in geological big data mining through the use of some novel computational intelligent methods such as rough set<sup id=\"rdp-ebb-cite_ref-LuoResearch12_37-0\" class=\"reference\"><a href=\"#cite_note-LuoResearch12-37\" rel=\"external_link\">[37]<\/a><\/sup> and fuzzy aggregation.<sup id=\"rdp-ebb-cite_ref-FarzamianAWeight16_38-0\" class=\"reference\"><a href=\"#cite_note-FarzamianAWeight16-38\" rel=\"external_link\">[38]<\/a><\/sup> Moreover, with the development of those neural-network-based machine learning algorithms in recent years, popular methods such as extreme learning machine<sup id=\"rdp-ebb-cite_ref-XuEffi17_39-0\" class=\"reference\"><a href=\"#cite_note-XuEffi17-39\" rel=\"external_link\">[39]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-LuoTowards17_40-0\" class=\"reference\"><a href=\"#cite_note-LuoTowards17-40\" rel=\"external_link\">[40]<\/a><\/sup>, approximate dynamic programming<sup id=\"rdp-ebb-cite_ref-LuoOnline15_41-0\" class=\"reference\"><a href=\"#cite_note-LuoOnline15-41\" rel=\"external_link\">[41]<\/a><\/sup>, and kernel learning<sup id=\"rdp-ebb-cite_ref-LuoAQuantKern17_42-0\" class=\"reference\"><a href=\"#cite_note-LuoAQuantKern17-42\" rel=\"external_link\">[42]<\/a><\/sup> could be used to further improve mining effectiveness for geological big data in the 
future.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Highly_performable_big_data_cloud_computing_platform\">Highly performable big data cloud computing platform<\/span><\/h3>\n<p>A highly performable big data cloud computing platform is the foundation for big data analysis. It enables parallel computing for large-scale incremental real-time data and large-scale heterogeneous data.<sup id=\"rdp-ebb-cite_ref-PassmoreEarth14_43-0\" class=\"reference\"><a href=\"#cite_note-PassmoreEarth14-43\" rel=\"external_link\">[43]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-LiTheSpatial10_44-0\" class=\"reference\"><a href=\"#cite_note-LiTheSpatial10-44\" rel=\"external_link\">[44]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-CruzAuto12_45-0\" class=\"reference\"><a href=\"#cite_note-CruzAuto12-45\" rel=\"external_link\">[45]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-XiaForming15_46-0\" class=\"reference\"><a href=\"#cite_note-XiaForming15-46\" rel=\"external_link\">[46]<\/a><\/sup>\n<\/p><p>With the advent of massive data storage solutions, many big data distributed computing frameworks have been proposed. Among them, Hadoop, MapReduce, Spark, and Storm are the most important distributed computing frameworks. 
These frameworks have different characteristics and solve different problems in applications.<sup id=\"rdp-ebb-cite_ref-IbrahimEval09_47-0\" class=\"reference\"><a href=\"#cite_note-IbrahimEval09-47\" rel=\"external_link\">[47]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-IqbalBig15_48-0\" class=\"reference\"><a href=\"#cite_note-IqbalBig15-48\" rel=\"external_link\">[48]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-Reyes-OrtizBig15_49-0\" class=\"reference\"><a href=\"#cite_note-Reyes-OrtizBig15-49\" rel=\"external_link\">[49]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-MengMLib16_50-0\" class=\"reference\"><a href=\"#cite_note-MengMLib16-50\" rel=\"external_link\">[50]<\/a><\/sup> Hadoop\/MapReduce is often used for offline complex big data processing, Spark is often employed in offline fast big data processing, and Storm is often used for real-time online big data processing. Each computing framework has its own advantages and disadvantages. Hadoop\/MapReduce is easy to program, and it has satisfactory scalability and fault tolerance. In addition, it is suitable for offline processing of massive data at the petabyte level, but it does not support real-time computation or stream computing. Spark is a memory-based iterative computing framework. By placing intermediate data in memory, Spark can achieve higher iterative calculation performance. The programming model of Spark is more flexible than that of Hadoop\/MapReduce, but Spark is not suitable for applications in which fine-grained updates are conducted asynchronously. Hence, Spark may be unsuitable for application models that require incremental changes. Storm is suitable for stream data processing. It can be used to handle a stream of incoming messages and can write the processed results to a specified storage device. Another major application of Storm is real-time data processing in which data need not be written to storage devices, which usually results in little time delay. 
Hence, Storm is particularly suitable for scenarios where real-time online analysis is required to obtain results for big data analysis.\n<\/p><p>An application example is the geological big data aggregation mining framework based on Hadoop.<sup id=\"rdp-ebb-cite_ref-ZhuAFrame15_16-1\" class=\"reference\"><a href=\"#cite_note-ZhuAFrame15-16\" rel=\"external_link\">[16]<\/a><\/sup> Geological big data aggregation mining platform research is based on the China geological survey data network, and it uses the Hadoop technology to improve and modify the existing platform, to make it suitable for big data applications, and to provide a platform for the pilot applications. The geological survey grid platform can be updated in three layers: the virtual layer, the computing layer, and the terminal application layer. The virtual layer represents the virtualization of computer resources based on Hadoop distributed file system (HDFS) virtualization technology, which is the foundation of cloud computing and cloud services. The computing layer mainly uses the MapReduce method to implement the analysis algorithms for geological big data. Currently, the geological big data technologies mainly use the block calculation strategy to achieve parallel analysis through the utilization of the characteristics of Hadoop, in an effort to speed up the analysis and processing of geological data. The terminal application layer is designed to display the results and receive user feedback to improve system availability.\n<\/p><p>MapReduce has been used to perform morphological correlation analysis, which involves the analysis of geochemical data processing and the study of the correlation between multielements. Figure 8 shows the pattern correlation between elements. It can be seen from Figure 8 that the elements of Mn, Co, and Be are similar in the distribution of morphology. Therefore, from a qualitative point-of-view, the correlation is relatively high. 
Moreover, in testing, the proposed prototype system ran three times faster than the existing common computing platform, showing that the Hadoop platform is applicable to geological big data. (More applications of MapReduce can be found in Giachetta's 2015 paper.<sup id=\"rdp-ebb-cite_ref-GiachettaAFrame_51-0\" class=\"reference\"><a href=\"#cite_note-GiachettaAFrame-51\" rel=\"external_link\">[51]<\/a><\/sup>)\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig8_ZhuSciProg2018_2018-2018.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"07ef9d8b3c2220b505b0977d8f05e9ff\"><img alt=\"Fig8 ZhuSciProg2018 2018-2018.png\" src=\"https:\/\/www.limswiki.org\/images\/9\/9d\/Fig8_ZhuSciProg2018_2018-2018.png\" width=\"350\" height=\"154\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 8.<\/b> Correlation among three element morphologies<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"Applications_of_geological_big_data_technologies\">Applications of geological big data technologies<\/span><\/h3>\n<h4><span class=\"mw-headline\" id=\"Exploration_of_metallogenic_law\">Exploration of metallogenic law<\/span><\/h4>\n<p>The metallogenic law is human knowledge of the regularities in the temporal and spatial distribution of mineral resources, and the level, capability, and scope of that knowledge are all related to the size of the data, the type of data, and the way the data are processed. Therefore, to deduce the metallogenic law, it is necessary to fully understand the massive data surrounding spatial distribution, reserves and production in mineral origin, the geological structure of the mineral origin, and related geological survey data. 
As such, it's important to conduct the regular speculation and objectivity expression of these geological big data so that one can identify the essential reasons for the distribution of mineral origin. Using geological big data technologies could help to translate data into new understanding or knowledge and help to guide the future of geological prospecting work.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Smart_prospecting\">Smart prospecting<\/span><\/h4>\n<p>Deposit types vary, and their formation is related to certain geological backgrounds and geological effects, respectively. The geological backgrounds include tectonic unit and stratigraphic unit, deep upper mantle and lithosphere conditions, and paleogeography and palaeoclimate environment on the surface of the earth. Geological effects include tectonism, magmatism, sedimentation, metamorphism, and weathering. These geologic backgrounds and effects, in the wide range of space and in the long geologic history, are a dynamic change and repeated stack, and large deposits can be formed only in a variety of favorable conditions. Long-term scientific research and experience is required to make sense of the accumulation of formed mineral deposits and associated mineralization predictions. Professionals are guided by certain theories and methods to adopt quantitative or qualitative prospecting methods to predict with the existing knowledge and experience.\n<\/p><p>However, in view of the difficulties of geological data sharing and the limitations of calculation tools and calculation methods, most of the known deposits in the past are independent of each other. 
In the future, we can use geological data to connect the exploration data of several adjacent deposits, conduct unified analysis and specialized processing, determine the \u201cdigital\u201d characteristics of the distribution of metallogenic materials, determine metallogenic potential, delineate the abnormal area and prospective area, and promote geological prospecting. Furthermore, geological data informatization and standardization could be improved.<sup id=\"rdp-ebb-cite_ref-HuangGeo16_52-0\" class=\"reference\"><a href=\"#cite_note-HuangGeo16-52\" rel=\"external_link\">[52]<\/a><\/sup>\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Service_of_people.E2.80.99s_livelihood_geology\">Service of people\u2019s livelihood geology<\/span><\/h4>\n<p>Since the start of the twenty-first century, geological work has become more closely related to economic development, and it plays an important role in every aspect of social and economic life. Agricultural geology, urban geology, environmental geology, tourism geology, disaster geology, and other fields of work have been strengthened, and the service area has also been expanded.<sup id=\"rdp-ebb-cite_ref-KouameTheStren17_53-0\" class=\"reference\"><a href=\"#cite_note-KouameTheStren17-53\" rel=\"external_link\">[53]<\/a><\/sup> At the same time, the public demand for geological information is increasingly urgent.<sup id=\"rdp-ebb-cite_ref-KarlssonLife17_54-0\" class=\"reference\"><a href=\"#cite_note-KarlssonLife17-54\" rel=\"external_link\">[54]<\/a><\/sup>\n<\/p><p>In order to meet the social demand for geological data, the China Geological Survey carried out the construction of a geological cloud, which built a cluster geologic data service system with the National Geological Information Center and the Provincial Geological Information Center as the backbone nodes, conducted the integration of data resources, and applied the GIS cloud technology, all in order to obtain large-scale computing ability and solve key problems 
such as the distributed storage, processing, query, interoperability, and virtualization of massive spatial data.<sup id=\"rdp-ebb-cite_ref-ChenTheCons15_5-4\" class=\"reference\"><a href=\"#cite_note-ChenTheCons15-5\" rel=\"external_link\">[5]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-TanArchi16_13-1\" class=\"reference\"><a href=\"#cite_note-TanArchi16-13\" rel=\"external_link\">[13]<\/a><\/sup> Recently, in China, Shandong Provincial Bureau of Geology and Mineral Resources also carried out the construction of \u201cthe application system of geological business based on e-government cloud platform.\u201d It mainly relies on the public service cloud platform of the e-government in Shandong province and constructs the government external network service system and internet service system to achieve the unified management and information service of the mineral resource. Using technical methods of spatial analysis, big data mining, and three-dimensional geological modelling, it develops a basic systems framework for geological mining services, featuring \u201ca (cloud) platform, a (data) center, and many application systems\u201d to improve the ability of the people\u2019s geological service, promote interaction with the public, realize socialization services, and promote the clustering and industrialization of the mineral resources information services.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Application_of_knowledge_visualization_service\">Application of knowledge visualization service<\/span><\/h4>\n<p>With the continuous development of web technology, human beings have experienced the \u201cWeb 1.0\u201d era, which was characterized by document interconnection, and \u201cWeb 2.0\u201d, which was characterized by data interconnection, and are moving towards the new \u201cWeb 3.0\u201d era based on the interconnected knowledge of the entity. 
Due to the continuous release of user-generated content and linked open data on the internet, people need to explore knowledge interconnection methods which both conform to the development of network information resources and meet users\u2019 requirements from a new perspective according to the knowledge organization principles in the big data environment, revealing human cognition on a deeper level.<sup id=\"rdp-ebb-cite_ref-StockToOnt12_55-0\" class=\"reference\"><a href=\"#cite_note-StockToOnt12-55\" rel=\"external_link\">[55]<\/a><\/sup>\n<\/p><p>In this context, Knowledge Graph (KG) was formally put forward by Google in May 2012, and its goal was to improve search results and describe the various entities and concepts that exist in the real world and the relationships between these entities and concepts. KG distills and refines present semantic web technology, keeping the essence and discarding the dross. In recent years, the applications of KG have been increasing rapidly, and there is now a mature method used to draw a KG and conduct intelligent academic research based on KG.<sup id=\"rdp-ebb-cite_ref-ZhuInt17_34-1\" class=\"reference\"><a href=\"#cite_note-ZhuInt17-34\" rel=\"external_link\">[34]<\/a><\/sup> However, the function of KG has not been fully implemented at present, especially for geological big data; the application aspect still needs to be further strengthened. 
Along this direction, the visualization service for geological data in the web-based system is attracting more and more attention.<sup id=\"rdp-ebb-cite_ref-HunterAWeb16_56-0\" class=\"reference\"><a href=\"#cite_note-HunterAWeb16-56\" rel=\"external_link\">[56]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-M.C3.BCllerTheGPlates16_57-0\" class=\"reference\"><a href=\"#cite_note-M.C3.BCllerTheGPlates16-57\" rel=\"external_link\">[57]<\/a><\/sup>\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Conclusion\">Conclusion<\/span><\/h2>\n<p>Big data technologies make it possible to process massive amounts of unstructured and semistructured geological data. The geological cloud also enables us to explore the application of demand-driven geological core data and to extract new information from unstructured data, while supporting decision-making in land resources management. Thus, the geological cloud could effectively organize and use geological big data, to mine the data scientifically, with the purpose of producing higher value and achieving the corresponding service.\n<\/p><p>Within the architecture of the geological cloud, this article describes the application background of CEGIS and the demands of big data management. Furthermore, we elaborate on the application requirements and challenges faced in big data management technologies. Further analysis is then provided from four aspects: data size, data type, data processing speed, and data processing accuracy. This article also outlines the research status and technology development opportunities related to big data in CEGIS, from the perspectives of big data acquisition and preprocessing, big data storage and management, big data analysis and mining, highly performable big data cloud computing platform, and big data technology applications. 
With the continuous development of big data technologies in addressing those challenges related to geological big data, such as the difficulties of describing and modeling geological big data with some complex characteristics, CEGIS will move towards a more mature and more intelligent direction in the future.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Conflicts_of_interest\">Conflicts of interest<\/span><\/h2>\n<p>The authors declare that there are no conflicts of interest regarding the publication of this article.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Acknowledgments\">Acknowledgments<\/span><\/h2>\n<p>This work was supported in part by the Key Laboratory of Geological Information Technology of Ministry of Land and Resources under Grant 2017320, the National Key Technologies R&D Program of China under Grant 2015BAK38B01, and the National Key R&D Program of China under Grant 2016YFC0600510.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-VermeeschMaking15-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-VermeeschMaking15_1-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Vermeesch, P.; Garzenti, E. (2015). \"Making geological sense of \u2018Big Data\u2019 in sedimentary provenance analysis\". <i>Chemical Geology<\/i> <b>409<\/b>: 20-27. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.chemgeo.2015.05.004\" target=\"_blank\">10.1016\/j.chemgeo.2015.05.004<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Making+geological+sense+of+%E2%80%98Big+Data%E2%80%99+in+sedimentary+provenance+analysis&rft.jtitle=Chemical+Geology&rft.aulast=Vermeesch%2C+P.%3B+Garzenti%2C+E.&rft.au=Vermeesch%2C+P.%3B+Garzenti%2C+E.&rft.date=2015&rft.volume=409&rft.pages=20-27&rft_id=info:doi\/10.1016%2Fj.chemgeo.2015.05.004&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ChenQuant16-2\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-ChenQuant16_2-0\" rel=\"external_link\">2.0<\/a><\/sup> <sup><a href=\"#cite_ref-ChenQuant16_2-1\" rel=\"external_link\">2.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Chen, J.; Xiang, J.; Hu, Q. et al. (2016). \"Quantitative Geoscience and Geological Big Data Development: A Review\". <i>Acta Geologica Sinica<\/i> <b>90<\/b> (4): 1490\u20131515. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1111%2F1755-6724.12782\" target=\"_blank\">10.1111\/1755-6724.12782<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Quantitative+Geoscience+and+Geological+Big+Data+Development%3A+A+Review&rft.jtitle=Acta+Geologica+Sinica&rft.aulast=Chen%2C+J.%3B+Xiang%2C+J.%3B+Hu%2C+Q.+et+al.&rft.au=Chen%2C+J.%3B+Xiang%2C+J.%3B+Hu%2C+Q.+et+al.&rft.date=2016&rft.volume=90&rft.issue=4&rft.pages=1490%E2%80%931515&rft_id=info:doi\/10.1111%2F1755-6724.12782&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ZhuCyber16-3\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-ZhuCyber16_3-0\" rel=\"external_link\">3.0<\/a><\/sup> <sup><a href=\"#cite_ref-ZhuCyber16_3-1\" rel=\"external_link\">3.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Zhu, Y.; Tan, Y.; Li, R. et al. (2016). \"Cyber-physical-social-thinking modeling and computing for geological information service system\". <i>International Journal of Distributed Sensor Networks<\/i> <b>12<\/b> (11). 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1177%2F1550147716666666\" target=\"_blank\">10.1177\/1550147716666666<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Cyber-physical-social-thinking+modeling+and+computing+for+geological+information+service+system&rft.jtitle=International+Journal+of+Distributed+Sensor+Networks&rft.aulast=Zhu%2C+Y.%3B+Tan%2C+Y.%3B+Li%2C+R.+et+al.&rft.au=Zhu%2C+Y.%3B+Tan%2C+Y.%3B+Li%2C+R.+et+al.&rft.date=2016&rft.volume=12&rft.issue=11&rft_id=info:doi\/10.1177%2F1550147716666666&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KimCloud13-4\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-KimCloud13_4-0\" rel=\"external_link\">4.0<\/a><\/sup> <sup><a href=\"#cite_ref-KimCloud13_4-1\" rel=\"external_link\">4.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Kim, Y.-H.; Yarlagadda, P. (2013). \"Cloud Computing Model for Big Geological Data Processing\". <i>Applied Mechanics and Materials<\/i> <b>475\u2013476<\/b>: 306\u2013311. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4028%2Fwww.scientific.net%2FAMM.475-476.306\" target=\"_blank\">10.4028\/www.scientific.net\/AMM.475-476.306<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Cloud+Computing+Model+for+Big+Geological+Data+Processing&rft.jtitle=Applied+Mechanics+and+Materials&rft.aulast=Kim%2C+Y.-H.%3B+Yarlagadda%2C+P.&rft.au=Kim%2C+Y.-H.%3B+Yarlagadda%2C+P.&rft.date=2013&rft.volume=475%E2%80%93476&rft.pages=306-311&rft_id=info:doi\/10.4028%2Fwww.scientific.net%2FAMM.475-476.306&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ChenTheCons15-5\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-ChenTheCons15_5-0\" rel=\"external_link\">5.0<\/a><\/sup> <sup><a href=\"#cite_ref-ChenTheCons15_5-1\" rel=\"external_link\">5.1<\/a><\/sup> <sup><a href=\"#cite_ref-ChenTheCons15_5-2\" rel=\"external_link\">5.2<\/a><\/sup> <sup><a href=\"#cite_ref-ChenTheCons15_5-3\" rel=\"external_link\">5.3<\/a><\/sup> <sup><a href=\"#cite_ref-ChenTheCons15_5-4\" rel=\"external_link\">5.4<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Chen, J.; Li, J.; Cui, N.; Yu, P. (2015). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/caod.oriprobe.com\/articles\/46629977\/The_construction_and_application_of_geological_cloud_under_the_big_dat.htm\" target=\"_blank\">\"The construction and application of geological cloud under the big data background\"<\/a>. <i>Geological Bulletin of China<\/i> <b>34<\/b> (7): 1260\u20131265<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/caod.oriprobe.com\/articles\/46629977\/The_construction_and_application_of_geological_cloud_under_the_big_dat.htm\" target=\"_blank\">http:\/\/caod.oriprobe.com\/articles\/46629977\/The_construction_and_application_of_geological_cloud_under_the_big_dat.htm<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+construction+and+application+of+geological+cloud+under+the+big+data+background&rft.jtitle=Geological+Bulletin+of+China&rft.aulast=Chen%2C+J.%3B+Li%2C+J.%3B+Cui%2C+N.%3B+Yu%2C+P.&rft.au=Chen%2C+J.%3B+Li%2C+J.%3B+Cui%2C+N.%3B+Yu%2C+P.&rft.date=2015&rft.volume=34&rft.issue=7&rft.pages=1260%E2%80%931265&rft_id=http%3A%2F%2Fcaod.oriprobe.com%2Farticles%2F46629977%2FThe_construction_and_application_of_geological_cloud_under_the_big_dat.htm&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LiTheTech10-6\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LiTheTech10_6-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Li, C. (2010). \"The technical infrastructure of geological survey information grid\". <i>Proceedings from the 18th International Conference on Geoinformatics<\/i> <b>2010<\/b>: 1\u20136. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FGEOINFORMATICS.2010.5567743\" target=\"_blank\">10.1109\/GEOINFORMATICS.2010.5567743<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+technical+infrastructure+of+geological+survey+information+grid&rft.jtitle=Proceedings+from+the+18th+International+Conference+on+Geoinformatics&rft.aulast=Li%2C+C.&rft.au=Li%2C+C.&rft.date=2010&rft.volume=2010&rft.pages=1%E2%80%936&rft_id=info:doi\/10.1109%2FGEOINFORMATICS.2010.5567743&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WuAGeo15-7\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WuAGeo15_7-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Wu, L.; Xue, L.; Li, C. et al. (2015). \"A Geospatial Information Grid Framework for Geological Survey\". <i>PLoS One<\/i> <b>10<\/b> (12): e0145312. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pone.0145312\" target=\"_blank\">10.1371\/journal.pone.0145312<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+Geospatial+Information+Grid+Framework+for+Geological+Survey&rft.jtitle=PLoS+One&rft.aulast=Wu%2C+L.%3B+Xue%2C+L.%3B+Li%2C+C.+et+al.&rft.au=Wu%2C+L.%3B+Xue%2C+L.%3B+Li%2C+C.+et+al.&rft.date=2015&rft.volume=10&rft.issue=12&rft.pages=e0145312&rft_id=info:doi\/10.1371%2Fjournal.pone.0145312&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-EvangelidisGeo14-8\"><span class=\"mw-cite-backlink\"><a 
href=\"#cite_ref-EvangelidisGeo14_8-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Evangelidis, K.; Ntouros, K.; Makridis, S. et al. (2014). \"Geospatial services in the Cloud\". <i>Computers & Geosciences<\/i> <b>63<\/b>: 116\u2013122. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.cageo.2013.10.007\" target=\"_blank\">10.1016\/j.cageo.2013.10.007<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Geospatial+services+in+the+Cloud&rft.jtitle=Computers+%26+Geosciences&rft.aulast=Evangelidis%2C+K.%3B+Ntouros%2C+K.%3B+Makridis%2C+S.+et+al.&rft.au=Evangelidis%2C+K.%3B+Ntouros%2C+K.%3B+Makridis%2C+S.+et+al.&rft.date=2014&rft.volume=63&rft.pages=116%E2%80%93122&rft_id=info:doi\/10.1016%2Fj.cageo.2013.10.007&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HuangGreen18-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HuangGreen18_9-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Huang, M.; Liu, A.; Wang, T.; Huang, C. (2017). <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.hindawi.com\/journals\/wcmc\/aip\/9715428\/\" target=\"_blank\">\"Green data gathering under delay differentiated services constraint for internet of things\"<\/a>. <i>Wireless Communications and Mobile Computing<\/i><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.hindawi.com\/journals\/wcmc\/aip\/9715428\/\" target=\"_blank\">https:\/\/www.hindawi.com\/journals\/wcmc\/aip\/9715428\/<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Green+data+gathering+under+delay+differentiated+services+constraint+for+internet+of+things&rft.jtitle=Wireless+Communications+and+Mobile+Computing&rft.aulast=Huang%2C+M.%3B+Liu%2C+A.%3B+Wang%2C+T.%3B+Huang%2C+C.&rft.au=Huang%2C+M.%3B+Liu%2C+A.%3B+Wang%2C+T.%3B+Huang%2C+C.&rft.date=2017&rft_id=https%3A%2F%2Fwww.hindawi.com%2Fjournals%2Fwcmc%2Faip%2F9715428%2F&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WoS-10\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WoS_10-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.webofknowledge.com\/\" target=\"_blank\">\"Web of Science\"<\/a>. Clarivate Analytics<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.webofknowledge.com\/\" target=\"_blank\">https:\/\/www.webofknowledge.com\/<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Web+of+Science&rft.atitle=&rft.pub=Clarivate+Analytics&rft_id=https%3A%2F%2Fwww.webofknowledge.com%2F&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-YangUtilizing17-11\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-YangUtilizing17_11-0\" rel=\"external_link\">11.0<\/a><\/sup> <sup><a href=\"#cite_ref-YangUtilizing17_11-1\" rel=\"external_link\">11.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Yang, C.; Yu, M.; Hu, F. et al. (2017). \"Utilizing cloud computing to address big geospatial data challenges\". <i>Computers, Environment and Urban Systems<\/i> <b>61<\/b> (Part B): 120\u2013128. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.compenvurbsys.2016.10.010\" target=\"_blank\">10.1016\/j.compenvurbsys.2016.10.010<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Utilizing+cloud+computing+to+address+big+geospatial+data+challenges&rft.jtitle=Computers%2C+Environment+and+Urban+Systems&rft.aulast=Yang%2C+C.%3B+Yu%2C+M.%3B+Hu%2C+F.+et+al.&rft.au=Yang%2C+C.%3B+Yu%2C+M.%3B+Hu%2C+F.+et+al.&rft.date=2017&rft.volume=61&rft.issue=Part+B&rft.pages=120%E2%80%93128&rft_id=info:doi\/10.1016%2Fj.compenvurbsys.2016.10.010&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WuAKnow17-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WuAKnow17_12-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Wu, L.; Xue, L.; Li, C. et al. (2017). \"A Knowledge-Driven Geospatially Enabled Framework for Geological Big Data\". <i>International Journal of Geo-Information<\/i> <b>6<\/b> (6): 166. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3390%2Fijgi6060166\" target=\"_blank\">10.3390\/ijgi6060166<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+Knowledge-Driven+Geospatially+Enabled+Framework+for+Geological+Big+Data&rft.jtitle=International+Journal+of+Geo-Information&rft.aulast=Wu%2C+L.%3B+Xue%2C+L.%3B+Li%2C+C.+et+al.&rft.au=Wu%2C+L.%3B+Xue%2C+L.%3B+Li%2C+C.+et+al.&rft.date=2017&rft.volume=6&rft.issue=6&rft.pages=166&rft_id=info:doi\/10.3390%2Fijgi6060166&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-TanArchi16-13\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-TanArchi16_13-0\" rel=\"external_link\">13.0<\/a><\/sup> <sup><a href=\"#cite_ref-TanArchi16_13-1\" rel=\"external_link\">13.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Tan, Y. (2016). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/caod.oriprobe.com\/articles\/48928882\/Architecture_and_Key_Issues_of_Geological_Big_Data_and_Information_Ser.htm\" target=\"_blank\">\"Architecture and Key Issues of Geological Big Data and Information Service Project\"<\/a>. <i>Geomatics World<\/i> <b>23<\/b> (1): 1\u20136<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/caod.oriprobe.com\/articles\/48928882\/Architecture_and_Key_Issues_of_Geological_Big_Data_and_Information_Ser.htm\" target=\"_blank\">http:\/\/caod.oriprobe.com\/articles\/48928882\/Architecture_and_Key_Issues_of_Geological_Big_Data_and_Information_Ser.htm<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Architecture+and+Key+Issues+of+Geological+Big+Data+and+Information+Service+Project&rft.jtitle=Geomatics+World&rft.aulast=Tan%2C+Y.&rft.au=Tan%2C+Y.&rft.date=2016&rft.volume=23&rft.issue=1&rft.pages=1%E2%80%936&rft_id=http%3A%2F%2Fcaod.oriprobe.com%2Farticles%2F48928882%2FArchitecture_and_Key_Issues_of_Geological_Big_Data_and_Information_Ser.htm&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-TanArchiInvest16-14\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-TanArchiInvest16_14-0\" rel=\"external_link\">14.0<\/a><\/sup> <sup><a href=\"#cite_ref-TanArchiInvest16_14-1\" rel=\"external_link\">14.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Tan, Y. (2016). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.zgdzdcbjb.com\/EN\/abstract\/abstract160.shtml\" target=\"_blank\">\"Architecture investigation of the construction of geological big data system\"<\/a>. <i>Geological Survey of China<\/i> <b>3<\/b> (3): 1\u20136<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.zgdzdcbjb.com\/EN\/abstract\/abstract160.shtml\" target=\"_blank\">http:\/\/www.zgdzdcbjb.com\/EN\/abstract\/abstract160.shtml<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Architecture+investigation+of+the+construction+of+geological+big+data+system&rft.jtitle=Geological+Survey+of+China&rft.aulast=Tan%2C+Y.&rft.au=Tan%2C+Y.&rft.date=2016&rft.volume=3&rft.issue=3&rft.pages=1%E2%80%936&rft_id=http%3A%2F%2Fwww.zgdzdcbjb.com%2FEN%2Fabstract%2Fabstract160.shtml&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HeProto14-15\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HeProto14_15-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">He, W.; Wang, Y. (2014). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/caod.oriprobe.com\/articles\/45636829\/Prototype_system_of_geological_cloud_computing.htm\" target=\"_blank\">\"Prototype system of geological cloud computing\"<\/a>. <i>Progress in Geophysics<\/i> <b>29<\/b> (6): 2886\u20132896<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/caod.oriprobe.com\/articles\/45636829\/Prototype_system_of_geological_cloud_computing.htm\" target=\"_blank\">http:\/\/caod.oriprobe.com\/articles\/45636829\/Prototype_system_of_geological_cloud_computing.htm<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Prototype+system+of+geological+cloud+computing&rft.jtitle=Progress+in+Geophysics&rft.aulast=He%2C+W.%3B+Wang%2C+Y.&rft.au=He%2C+W.%3B+Wang%2C+Y.&rft.date=2014&rft.volume=29&rft.issue=6&rft.pages=2886%E2%80%932896&rft_id=http%3A%2F%2Fcaod.oriprobe.com%2Farticles%2F45636829%2FPrototype_system_of_geological_cloud_computing.htm&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ZhuAFrame15-16\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-ZhuAFrame15_16-0\" rel=\"external_link\">16.0<\/a><\/sup> <sup><a href=\"#cite_ref-ZhuAFrame15_16-1\" rel=\"external_link\">16.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Zhu, Y.; Tan, T.; Zhang, J. et al. (2015). \"A framework of hadoop based geology big data fusion and mining technologies\". <i>Cehui Xuebao\/Acta Geodaetica et Cartographica Sinica<\/i> <b>44<\/b> (S0): 152\u2013159. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.11947%2Fj.AGCS.2015.F059\" target=\"_blank\">10.11947\/j.AGCS.2015.F059<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+framework+of+hadoop+based+geology+big+data+fusion+and+mining+technologies&rft.jtitle=Cehui+Xuebao%2FActa+Geodaetica+et+Cartographica+Sinica&rft.aulast=Zhu%2C+Y.%3B+Tan%2C+T.%3B+Zhang%2C+J.+et+al.&rft.au=Zhu%2C+Y.%3B+Tan%2C+T.%3B+Zhang%2C+J.+et+al.&rft.date=2015&rft.volume=44&rft.issue=S0&rft.pages=152%E2%80%93159&rft_id=info:doi\/10.11947%2Fj.AGCS.2015.F059&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WangChar16-17\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-WangChar16_17-0\" rel=\"external_link\">17.0<\/a><\/sup> <sup><a href=\"#cite_ref-WangChar16_17-1\" rel=\"external_link\">17.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Wang, D.; Liu, X.; Liu, L. (2015). \"Characteristics of big geodata and its application to study of minerogenetic regularity and minerogenetic series\". <i>Mineral Deposits<\/i> <b>34<\/b> (6): 1143\u20131154. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.16111%2Fj.0258-7106.2015.06.004\" target=\"_blank\">10.16111\/j.0258-7106.2015.06.004<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Characteristics+of+big+geodata+and+its+application+to+study+of+minerogenetic+regularity+and+minerogenetic+series&rft.jtitle=Mineral+Deposits&rft.aulast=Wang%2C+D.%3B+Liu%2C+X.%3B+Liu%2C+L.&rft.au=Wang%2C+D.%3B+Liu%2C+X.%3B+Liu%2C+L.&rft.date=2015&rft.volume=34&rft.issue=6&rft.pages=1143%E2%80%931154&rft_id=info:doi\/10.16111%2Fj.0258-7106.2015.06.004&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PanMan17-18\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-PanMan17_18-0\" rel=\"external_link\">18.0<\/a><\/sup> <sup><a href=\"#cite_ref-PanMan17_18-1\" rel=\"external_link\">18.1<\/a><\/sup> <sup><a href=\"#cite_ref-PanMan17_18-2\" rel=\"external_link\">18.2<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Pan, B.; Yang, R. (2017). <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/caod.oriprobe.com\/articles\/50925192\/Management_and_Utilization_of_Big_Data_for_Geology.htm\" target=\"_blank\">\"Management and Utilization of Big Data for Geology\"<\/a>. <i>Surveying and Mapping of Geology and Mineral Resources<\/i> <b>33<\/b> (1): 1\u20133, 14<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/caod.oriprobe.com\/articles\/50925192\/Management_and_Utilization_of_Big_Data_for_Geology.htm\" target=\"_blank\">https:\/\/caod.oriprobe.com\/articles\/50925192\/Management_and_Utilization_of_Big_Data_for_Geology.htm<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Management+and+Utilization+of+Big+Data+for+Geology&rft.jtitle=Surveying+and+Mapping+of+Geology+and+Mineral+Resources&rft.aulast=Pan%2C+B.%3B+Yang%2C+R.&rft.au=Pan%2C+B.%3B+Yang%2C+R.&rft.date=2017&rft.volume=33&rft.issue=1&rft.pages=1%E2%80%933%2C+14&rft_id=https%3A%2F%2Fcaod.oriprobe.com%2Farticles%2F50925192%2FManagement_and_Utilization_of_Big_Data_for_Geology.htm&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-YangTheRes14-19\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-YangTheRes14_19-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Yang, P.; Lu, L.J. (2014). \"The Research on Encoding Methodology of the Character of Geological Entity Based on Mass Geological Data\". <i>Advanced Materials Research<\/i> <b>962\u2013965<\/b>: 208\u2013212. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4028%2Fwww.scientific.net%2FAMR.962-965.208\" target=\"_blank\">10.4028\/www.scientific.net\/AMR.962-965.208<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+Research+on+Encoding+Methodology+of+the+Character+of+Geological+Entity+Based+on+Mass+Geological+Data&rft.jtitle=Advanced+Materials+Research&rft.aulast=Yang%2C+P.%3B+Lu%2C+L.J.&rft.au=Yang%2C+P.%3B+Lu%2C+L.J.&rft.date=2014&rft.volume=962-965&rft.pages=208%E2%80%93212&rft_id=info:doi\/10.4028%2Fwww.scientific.net%2FAMR.962-965.208&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LuoAKern16-20\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LuoAKern16_20-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Luo, X.; Zhang, D.; Yang, L.T. et al. (2016). \"A kernel machine-based secure data sensing and fusion scheme in wireless sensor networks for the cyber-physical systems\". <i>Future Generation Computer Systems<\/i> <b>61<\/b>: 85\u201396. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.future.2015.10.022\" target=\"_blank\">10.1016\/j.future.2015.10.022<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+kernel+machine-based+secure+data+sensing+and+fusion+scheme+in+wireless+sensor+networks+for+the+cyber-physical+systems&rft.jtitle=Future+Generation+Computer+Systems&rft.aulast=Luo%2C+X.%3B+Zhang%2C+D.%3B+Yang%2C+L.T.+et+al.&rft.au=Luo%2C+X.%3B+Zhang%2C+D.%3B+Yang%2C+L.T.+et+al.&rft.date=2016&rft.volume=61&rft.pages=85%E2%80%9396&rft_id=info:doi\/10.1016%2Fj.future.2015.10.022&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KuoInter16-21\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-KuoInter16_21-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Kuo, C.-L.; Hong, J.-H. (2016). \"Interoperable cross-domain semantic and geospatial framework for automatic change detection\". <i>Computers & Geosciences<\/i> <b>86<\/b>: 109\u2013119. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.cageo.2015.10.011\" target=\"_blank\">10.1016\/j.cageo.2015.10.011<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Interoperable+cross-domain+semantic+and+geospatial+framework+for+automatic+change+detection&rft.jtitle=Computers+%26+Geosciences&rft.aulast=Kuo%2C+C.-L.%3B+Hong%2C+J.-H.&rft.au=Kuo%2C+C.-L.%3B+Hong%2C+J.-H.&rft.date=2016&rft.volume=86&rft.pages=109%E2%80%93119&rft_id=info:doi\/10.1016%2Fj.cageo.2015.10.011&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WangKey11-22\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WangKey11_22-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Wang, Y.-J.; Sun, W.-D.; Zhou, S. et al. (2011). \"Key Technologies of Distributed Storage for Cloud Computing\". <i>Journal of Software<\/i> <b>23<\/b> (4): 962\u2013986. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3724%2FSP.J.1001.2012.04175\" target=\"_blank\">10.3724\/SP.J.1001.2012.04175<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Key+Technologies+of+Distributed+Storage+for+Cloud+Computing&rft.jtitle=Journal+of+Software&rft.aulast=Wang%2C+Y.-J.%3B+Sun%2C+W.-D.%3B+Zhou%2C+S.+et+al.&rft.au=Wang%2C+Y.-J.%3B+Sun%2C+W.-D.%3B+Zhou%2C+S.+et+al.&rft.date=2011&rft.volume=23&rft.issue=4&rft.pages=962-986&rft_id=info:doi\/10.3724%2FSP.J.1001.2012.04175&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ArmbrustAbove09-23\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-ArmbrustAbove09_23-0\" rel=\"external_link\">23.0<\/a><\/sup> <sup><a href=\"#cite_ref-ArmbrustAbove09_23-1\" rel=\"external_link\">23.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation web\">Armbrust, M.; Fox, A.; Griffith, R. et al. (10 February 2009). <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www2.eecs.berkeley.edu\/Pubs\/TechRpts\/2009\/EECS-2009-28.pdf\" target=\"_blank\">\"Above the Clouds: A Berkeley View of Cloud Computing\"<\/a> (PDF). University of California at Berkeley<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www2.eecs.berkeley.edu\/Pubs\/TechRpts\/2009\/EECS-2009-28.pdf\" target=\"_blank\">https:\/\/www2.eecs.berkeley.edu\/Pubs\/TechRpts\/2009\/EECS-2009-28.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Above+the+Clouds%3A+A+Berkeley+View+of+Cloud+Computing&rft.atitle=&rft.aulast=Armbrust%2C+M.%3B+Fox%2C+A.%3B+Griffith%2C+R.+et+al.&rft.au=Armbrust%2C+M.%3B+Fox%2C+A.%3B+Griffith%2C+R.+et+al.&rft.date=10+February+2009&rft.pub=University+of+California+at+Berkeley&rft_id=https%3A%2F%2Fwww2.eecs.berkeley.edu%2FPubs%2FTechRpts%2F2009%2FEECS-2009-28.pdf&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-XiaDesign14-24\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-XiaDesign14_24-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Xia, J.; Bai, Z.; Wang, B. et al. (2014). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/xbna.pku.edu.cn\/EN\/abstract\/abstract2677.shtml\" target=\"_blank\">\"Design and Implementation of Comprehensive Management Platform for Geological Data Informatization\"<\/a>. <i>Acta Scientiarum Naturalium Universitatis Pekinensis<\/i> <b>50<\/b> (2): 295\u2013300<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/xbna.pku.edu.cn\/EN\/abstract\/abstract2677.shtml\" target=\"_blank\">http:\/\/xbna.pku.edu.cn\/EN\/abstract\/abstract2677.shtml<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Design+and+Implementation+of+Comprehensive+Management+Platform+for+Geological+Data+Informatization&rft.jtitle=Acta+Scientiarum+Naturalium+Universitatis+Pekinensis&rft.aulast=Xia%2C+J.%3B+Bai%2C+Z.%3B+Wang%2C+B.+et+al.&rft.au=Xia%2C+J.%3B+Bai%2C+Z.%3B+Wang%2C+B.+et+al.&rft.date=2014&rft.volume=50&rft.issue=2&rft.pages=295-300&rft_id=http%3A%2F%2Fxbna.pku.edu.cn%2FEN%2Fabstract%2Fabstract2677.shtml&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HuaData15-25\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HuaData15_25-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Hua, W.; Liu, J.; Liu, X. (2015). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.cnki.com.cn\/Article_en\/CJFDTotal-DQKX201503004.htm\" target=\"_blank\">\"Data Management of Object Type Geological Features on Control Dictionary\"<\/a>. <i>Earth Science - Journal of China University of Geosciences<\/i> <b>40<\/b> (3): 425\u2013430<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/en.cnki.com.cn\/Article_en\/CJFDTotal-DQKX201503004.htm\" target=\"_blank\">http:\/\/en.cnki.com.cn\/Article_en\/CJFDTotal-DQKX201503004.htm<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Data+Management+of+Object+Type+Geological+Features+on+Control+Dictionary&rft.jtitle=Earth+Science+-+Journal+of+China+University+of+Geosciences&rft.aulast=Hua%2C+W.%3B+Liu%2C+J.%3B++Liu%2C+X.&rft.au=Hua%2C+W.%3B+Liu%2C+J.%3B++Liu%2C+X.&rft.date=2015&rft.volume=40&rft.issue=3&rft.pages=425%E2%80%93430&rft_id=http%3A%2F%2Fen.cnki.com.cn%2FArticle_en%2FCJFDTotal-DQKX201503004.htm&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-JiaDesign10-26\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-JiaDesign10_26-0\" rel=\"external_link\">26.0<\/a><\/sup> <sup><a href=\"#cite_ref-JiaDesign10_26-1\" rel=\"external_link\">26.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Jia, B.; Wang, C.; Liu, C. et al. (2010). \"Design and implementation of object-oriented spatial database of coalfield geological hazards-based on object-oriented data model\". <i>Proceedings from the 2010 International Conference on Computer Application and System Modeling<\/i> <b>2010<\/b>: V1282\u2013V1286. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FICCASM.2010.5619411\" target=\"_blank\">10.1109\/ICCASM.2010.5619411<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Design+and+implementation+of+object-oriented+spatial+database+of+coalfield+geological+hazards-based+on+object-oriented+data+model&rft.jtitle=Proceedings+from+the+2010+International+Conference+on+Computer+Application+and+System+Modeling&rft.aulast=Jia%2C+B.%3B+Wang%2C+C.%3B+Liu%2C+C.+et+al.&rft.au=Jia%2C+B.%3B+Wang%2C+C.%3B+Liu%2C+C.+et+al.&rft.date=2010&rft.volume=2010&rft.pages=V1282%E2%80%93V1286&rft_id=info:doi\/10.1109%2FICCASM.2010.5619411&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AGIS-27\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AGIS_27-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/enterprise.arcgis.com\/en\/\" target=\"_blank\">\"ArcGIS Enterprise\"<\/a>. Environmental Systems Research Institute, Inc<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/enterprise.arcgis.com\/en\/\" target=\"_blank\">http:\/\/enterprise.arcgis.com\/en\/<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=ArcGIS+Enterprise&rft.atitle=&rft.pub=Environmental+Systems+Research+Institute%2C+Inc&rft_id=http%3A%2F%2Fenterprise.arcgis.com%2Fen%2F&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ZhouDesign13-28\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ZhouDesign13_28-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Zhou, X.; Li, X.; Chen, A. et al. (2013). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.cnki.com.cn\/Article_en\/CJFDTOTAL-CHXG201304020.htm\" target=\"_blank\">\"Design and Implementation of the Service System of Spatial Data for Geological Data\"<\/a>. <i>Journal of Geomatics<\/i> <b>38<\/b> (4): 57\u201360<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/en.cnki.com.cn\/Article_en\/CJFDTOTAL-CHXG201304020.htm\" target=\"_blank\">http:\/\/en.cnki.com.cn\/Article_en\/CJFDTOTAL-CHXG201304020.htm<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Design+and+Implementation+of+the+Service+System+of+Spatial+Data+for+Geological+Data&rft.jtitle=Journal+of+Geomatics&rft.aulast=Zhou%2C+X.%3B+Li%2C+X.%3B+Chen%2C+A.+et+al.&rft.au=Zhou%2C+X.%3B+Li%2C+X.%3B+Chen%2C+A.+et+al.&rft.date=2013&rft.volume=38&rft.issue=4&rft.pages=57%E2%80%9360&rft_id=http%3A%2F%2Fen.cnki.com.cn%2FArticle_en%2FCJFDTOTAL-CHXG201304020.htm&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HuangOpp16-29\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HuangOpp16_29-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Huang, H.; Cao, Z.; Feng, C. (2016). \"Opportunities and challenges of big data intelligence analysis\". <i>CAAI Transactions on Intelligent Systems<\/i> <b>11<\/b> (6): 719-727. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.11992%2Ftis.201610025\" target=\"_blank\">10.11992\/tis.201610025<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Opportunities+and+challenges+of+big+data+intelligence+analysis&rft.jtitle=CAAI+Transactions+on+Intelligent+Systems&rft.aulast=Huang%2C+H.%3B+Cao%2C+Z.%3B+Feng%2C+C.&rft.au=Huang%2C+H.%3B+Cao%2C+Z.%3B+Feng%2C+C.&rft.date=2016&rft.volume=11&rft.issue=6&rft.pages=719-727&rft_id=info:doi\/10.11992%2Ftis.201610025&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-JinComm15-30\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-JinComm15_30-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Jin, S.; Lin, W.; Yin, H. et al. (2015). \"Community structure mining in big data social media networks with MapReduce\". <i>Cluster Computing<\/i> <b>18<\/b> (3): 999\u20131010. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2Fs10586-015-0452-x\" target=\"_blank\">10.1007\/s10586-015-0452-x<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Community+structure+mining+in+big+data+social+media+networks+with+MapReduce&rft.jtitle=Cluster+Computing&rft.aulast=Jin%2C+S.%3B+Lin%2C+W.%3B+Yin%2C+H.+et+al.&rft.au=Jin%2C+S.%3B+Lin%2C+W.%3B+Yin%2C+H.+et+al.&rft.date=2015&rft.volume=18&rft.issue=3&rft.pages=999%E2%80%931010&rft_id=info:doi\/10.1007%2Fs10586-015-0452-x&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-YangMana11-31\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-YangMana11_31-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Yang, C.C.; Wei, C.-P.; Chien, L.-F. (2011). \"Managing and mining multilingual documents: Introduction to the special topic issue of information processing management\". <i>Information Processing & Management<\/i> <b>47<\/b> (5): 633-634. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.ipm.2010.02.002\" target=\"_blank\">10.1016\/j.ipm.2010.02.002<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Managing+and+mining+multilingual+documents%3A+Introduction+to+the+special+topic+issue+of+information+processing+management&rft.jtitle=Information+Processing+%26+Management&rft.aulast=Yang%2C+C.C.%3B+Wei%2C+C.-P.%3B+Chien%2C+L.-F.&rft.au=Yang%2C+C.C.%3B+Wei%2C+C.-P.%3B+Chien%2C+L.-F.&rft.date=2011&rft.volume=47&rft.issue=5&rft.pages=633-634&rft_id=info:doi\/10.1016%2Fj.ipm.2010.02.002&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LuoAQuant17-32\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LuoAQuant17_32-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Luo, X.; Deng, J.; Wang, W. et al. (2017). \"A Quantized Kernel Learning Algorithm Using a Minimum Kernel Risk-Sensitive Loss Criterion and Bilateral Gradient Technique\". <i>Entropy<\/i> <b>19<\/b> (7): 365. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3390%2Fe19070365\" target=\"_blank\">10.3390\/e19070365<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+Quantized+Kernel+Learning+Algorithm+Using+a+Minimum+Kernel+Risk-Sensitive+Loss+Criterion+and+Bilateral+Gradient+Technique&rft.jtitle=Entropy&rft.aulast=Luo%2C+X.%3B+Deng%2C+J.%3B+Wang%2C+W.+et+al.&rft.au=Luo%2C+X.%3B+Deng%2C+J.%3B+Wang%2C+W.+et+al.&rft.date=2017&rft.volume=19&rft.issue=7&rft.pages=365&rft_id=info:doi\/10.3390%2Fe19070365&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-TseGeo15-33\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-TseGeo15_33-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Tse, C.H.; Li, Y.; Lam, E.Y. (2015). \"Geological applications of machine learning on hyperspectral remote sensing data\". <i>Proceedings Volume 9405: SPIE\/IS&T Electronic Imaging 2015<\/i> <b>9405<\/b> (2015). 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1117%2F12.2178400\" target=\"_blank\">10.1117\/12.2178400<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Geological+applications+of+machine+learning+on+hyperspectral+remote+sensing+data&rft.jtitle=Proceedings+Volume+9405%3A+SPIE%2FIS%26T+Electronic+Imaging+2015&rft.aulast=Tse%2C+C.H.%3B+Li%2C+Y.%3B+Lam%2C+E.Y.&rft.au=Tse%2C+C.H.%3B+Li%2C+Y.%3B+Lam%2C+E.Y.&rft.date=2015&rft.volume=9405&rft.issue=2015&rft_id=info:doi\/10.1117%2F12.2178400&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ZhuInt17-34\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-ZhuInt17_34-0\" rel=\"external_link\">34.0<\/a><\/sup> <sup><a href=\"#cite_ref-ZhuInt17_34-1\" rel=\"external_link\">34.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Zhu, Y.; Zhou, W.; Xu, Y. et al. (2017). \"Intelligent Learning for Knowledge Graph towards Geological Data\". <i>Scientific Programming<\/i> <b>2017<\/b> (2017). 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1155%2F2017%2F5072427\" target=\"_blank\">10.1155\/2017\/5072427<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Intelligent+Learning+for+Knowledge+Graph+towards+Geological+Data&rft.jtitle=Scientific+Programming&rft.aulast=Zhu%2C+Y.%3B+Zhou%2C+W.%3B+Xu%2C+Y.+et+al.&rft.au=Zhu%2C+Y.%3B+Zhou%2C+W.%3B+Xu%2C+Y.+et+al.&rft.date=2017&rft.volume=2017&rft.issue=2017&rft_id=info:doi\/10.1155%2F2017%2F5072427&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-VoData15-35\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-VoData15_35-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Vo, H.X.; Durlofsky, L.J. (2015). \"Data assimilation and uncertainty assessment for complex geological models using a new PCA-based parameterization\". <i>Computational Geosciences<\/i> <b>19<\/b> (4): 747\u2013767. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2Fs10596-015-9483-x\" target=\"_blank\">10.1007\/s10596-015-9483-x<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Data+assimilation+and+uncertainty+assessment+for+complex+geological+models+using+a+new+PCA-based+parameterization&rft.jtitle=Computational+Geosciences&rft.aulast=Vo%2C+H.X.%3B+Durlofsky%2C+L.J.&rft.au=Vo%2C+H.X.%3B+Durlofsky%2C+L.J.&rft.date=2015&rft.volume=19&rft.issue=4&rft.pages=747%E2%80%93767&rft_id=info:doi\/10.1007%2Fs10596-015-9483-x&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GasmiPCA16-36\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GasmiPCA16_36-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Gasmi, A.; Gomez, C.; Zouai, H. et al. (2016). \"PCA and SVM as geo-computational methods for geological mapping in the southern of Tunisia, using ASTER remote sensing data set\". <i>Arabian Journal of Geosciences<\/i> <b>9<\/b>: 753. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2Fs12517-016-2791-1\" target=\"_blank\">10.1007\/s12517-016-2791-1<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=PCA+and+SVM+as+geo-computational+methods+for+geological+mapping+in+the+southern+of+Tunisia%2C+using+ASTER+remote+sensing+data+set&rft.jtitle=Arabian+Journal+of+Geosciences&rft.aulast=Gasmi%2C+A.%3B+Gomez%2C+C.%3B+Zouai%2C+H.+et+al.&rft.au=Gasmi%2C+A.%3B+Gomez%2C+C.%3B+Zouai%2C+H.+et+al.&rft.date=2016&rft.volume=9&rft.pages=753&rft_id=info:doi\/10.1007%2Fs12517-016-2791-1&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LuoResearch12-37\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LuoResearch12_37-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Luo, Z.S.; Wei, Y.T. (2012). \"Research on Rough Set Applied in the Geological Measure Data Prediction Model\". <i>Advanced Materials Research<\/i> <b>457-458<\/b>: 792-798. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4028%2Fwww.scientific.net%2FAMR.457-458.792\" target=\"_blank\">10.4028\/www.scientific.net\/AMR.457-458.792<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Research+on+Rough+Set+Applied+in+the+Geological+Measure+Data+Prediction+Model&rft.jtitle=Advanced+Materials+Research&rft.aulast=Luo%2C+Z.S.%3B+Wei%2C+Y.T.&rft.au=Luo%2C+Z.S.%3B+Wei%2C+Y.T.&rft.date=2012&rft.volume=457-458&rft.pages=792-798&rft_id=info:doi\/10.4028%2Fwww.scientific.net%2FAMR.457-458.792&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-FarzamianAWeight16-38\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-FarzamianAWeight16_38-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Farzamian, M.; Rouhani, A.K.; Yarmohammadi, A. et al. (2016). \"A weighted fuzzy aggregation GIS model in the integration of geophysical data with geochemical and geological data for Pb\u2013Zn exploration in Takab area, NW Iran\". <i>Arabian Journal of Geosciences<\/i> <b>9<\/b>: 104. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2Fs12517-015-2202-z\" target=\"_blank\">10.1007\/s12517-015-2202-z<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+weighted+fuzzy+aggregation+GIS+model+in+the+integration+of+geophysical+data+with+geochemical+and+geological+data+for+Pb%E2%80%93Zn+exploration+in+Takab+area%2C+NW+Iran&rft.jtitle=Arabian+Journal+of+Geosciences&rft.aulast=Farzamian%2C+M.%3B+Rouhani%2C+A.K.%3B+Yarmohammadi%2C+A.+et+al.&rft.au=Farzamian%2C+M.%3B+Rouhani%2C+A.K.%3B+Yarmohammadi%2C+A.+et+al.&rft.date=2016&rft.volume=9&rft.pages=104&rft_id=info:doi\/10.1007%2Fs12517-015-2202-z&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-XuEffi17-39\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-XuEffi17_39-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Xu, Y.; Luo, X.; Wang, W. et al. (2017). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5298708\" target=\"_blank\">\"Efficient DV-HOP Localization for Wireless Cyber-Physical Social Sensing System: A Correntropy-Based Neural Network Learning Scheme\"<\/a>. <i>Sensors<\/i> <b>17<\/b> (1): E135. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3390%2Fs17010135\" target=\"_blank\">10.3390\/s17010135<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5298708\/\" target=\"_blank\">PMC5298708<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/28085084\" target=\"_blank\">28085084<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5298708\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC5298708<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Efficient+DV-HOP+Localization+for+Wireless+Cyber-Physical+Social+Sensing+System%3A+A+Correntropy-Based+Neural+Network+Learning+Scheme&rft.jtitle=Sensors&rft.aulast=Xu%2C+Y.%3B+Luo%2C+X.%3B+Wang%2C+W.+et+al.&rft.au=Xu%2C+Y.%3B+Luo%2C+X.%3B+Wang%2C+W.+et+al.&rft.date=2017&rft.volume=17&rft.issue=1&rft.pages=E135&rft_id=info:doi\/10.3390%2Fs17010135&rft_id=info:pmc\/PMC5298708&rft_id=info:pmid\/28085084&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC5298708&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LuoTowards17-40\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LuoTowards17_40-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Luo, X.; Xu, Y.; Wang, W. et al. (2017). 
\"Towards enhancing stacked extreme learning machine with sparse autoencoder by correntropy\". <i>Journal of the Franklin Institute<\/i>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.jfranklin.2017.08.014\" target=\"_blank\">10.1016\/j.jfranklin.2017.08.014<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Towards+enhancing+stacked+extreme+learning+machine+with+sparse+autoencoder+by+correntropy&rft.jtitle=Journal+of+the+Franklin+Institute&rft.aulast=Luo%2C+X.%3B+Xu%2C+Y.%3B+Wang%2C+W.+et+al.&rft.au=Luo%2C+X.%3B+Xu%2C+Y.%3B+Wang%2C+W.+et+al.&rft.date=2017&rft_id=info:doi\/10.1016%2Fj.jfranklin.2017.08.014&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LuoOnline15-41\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LuoOnline15_41-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Luo, X.; Luo, H.; Chang, X. (2015). \"Online Optimization of Collaborative Web Service QoS Prediction Based on Approximate Dynamic Programming\". <i>International Journal of Distributed Sensor Networks<\/i> <b>11<\/b> (8): 452492. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1155%2F2015%2F452492\" target=\"_blank\">10.1155\/2015\/452492<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Online+Optimization+of+Collaborative+Web+Service+QoS+Prediction+Based+on+Approximate+Dynamic+Programming&rft.jtitle=International+Journal+of+Distributed+Sensor+Networks&rft.aulast=Luo%2C+X.%3B+Luo%2C+H.%3B+Chang%2C+X.&rft.au=Luo%2C+X.%3B+Luo%2C+H.%3B+Chang%2C+X.&rft.date=2015&rft.volume=11&rft.issue=8&rft.pages=452492&rft_id=info:doi\/10.1155%2F2015%2F452492&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LuoAQuantKern17-42\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LuoAQuantKern17_42-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Luo, X.; Deng, J.; Liu, J. et al. (2017). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.cic-chinacommunications.cn\/CN\/Y2017\/V14\/I7\/127\" target=\"_blank\">\"A Quantized Kernel Least Mean Square Scheme with Entropy-Guided Learning for Intelligent Data Analysis\"<\/a>. <i>China Communications<\/i> <b>14<\/b> (7): 127\u2013136<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.cic-chinacommunications.cn\/CN\/Y2017\/V14\/I7\/127\" target=\"_blank\">http:\/\/www.cic-chinacommunications.cn\/CN\/Y2017\/V14\/I7\/127<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+Quantized+Kernel+Least+Mean+Square+Scheme+with+Entropy-Guided+Learning+for+Intelligent+Data+Analysis&rft.jtitle=China+Communications&rft.aulast=Luo%2C+X.%3B+Deng%2C+J.%3B+Liu%2C+J.+et+al.&rft.au=Luo%2C+X.%3B+Deng%2C+J.%3B+Liu%2C+J.+et+al.&rft.date=2017&rft.volume=14&rft.issue=7&rft.pages=127%E2%80%93136&rft_id=http%3A%2F%2Fwww.cic-chinacommunications.cn%2FCN%2FY2017%2FV14%2FI7%2F127&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PassmoreEarth14-43\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PassmoreEarth14_43-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Passmore, J.; Laxton, J.; Sen, M. (2014). \"EarthServer for Geological applications \u2013 Opening up access to big data using OGC web services\". In Toll, D.G.; Zhu, H.; Osman, A. et al. <i>Information Technology in Geo-Engineering<\/i>. Advances in Soil Mechanics and Geotechnical Engineering. <b>3<\/b>. IOS Press. pp. 123\u2013129. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3233%2F978-1-61499-417-6-123\" target=\"_blank\">10.3233\/978-1-61499-417-6-123<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9781614994176.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=EarthServer+for+Geological+applications+%E2%80%93+Opening+up+access+to+big+data+using+OGC+web+services&rft.atitle=Information+Technology+in+Geo-Engineering&rft.aulast=Passmore%2C+J.%3B+Laxton%2C+J.%3B+Sen%2C+M.&rft.au=Passmore%2C+J.%3B+Laxton%2C+J.%3B+Sen%2C+M.&rft.date=2014&rft.series=Advances+in+Soil+Mechanics+and+Geotechnical+Engineering&rft.volume=3&rft.pages=pp.%26nbsp%3B123%E2%80%93129&rft.pub=IOS+Press&rft_id=info:doi\/10.3233%2F978-1-61499-417-6-123&rft.isbn=9781614994176&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LiTheSpatial10-44\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LiTheSpatial10_44-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Li, C.; Song, M.; Lv, X. et al. (2010). \"The Spatial Data Sharing Mechanisms of Geological Survey Information Grid in P2P Mixed Network Systems Network Architecture Model\". <i>Proceedings from the 9th International Conference on Grid and Cooperative Computing<\/i> <b>2010<\/b>: 258\u2013263. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FGCC.2010.59\" target=\"_blank\">10.1109\/GCC.2010.59<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+Spatial+Data+Sharing+Mechanisms+of+Geological+Survey+Information+Grid+in+P2P+Mixed+Network+Systems+Network+Architecture+Model&rft.jtitle=Proceedings+from+the+9th+International+Conference+on+Grid+and+Cooperative+Computing&rft.aulast=Li%2C+C.%3B+Song%2C+M.%3B+Lv%2C+X.+et+al.&rft.au=Li%2C+C.%3B+Song%2C+M.%3B+Lv%2C+X.+et+al.&rft.date=2010&rft.volume=2010&rft.pages=258%E2%80%93263&rft_id=info:doi\/10.1109%2FGCC.2010.59&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CruzAuto12-45\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CruzAuto12_45-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Cruz, S.A.B.; Monteiro, A.M.V.; Santos, R. (2012). \"Automated geospatial Web Services composition based on geodata quality requirements\". <i>Computers & Geosciences<\/i> <b>47<\/b>: 60\u201374. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.cageo.2011.11.020\" target=\"_blank\">10.1016\/j.cageo.2011.11.020<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Automated+geospatial+Web+Services+composition+based+on+geodata+quality+requirements&rft.jtitle=Computers+%26+Geosciences&rft.aulast=Cruz%2C+S.A.B.%3B+Monteiro%2C+A.M.V.%3B+Santos%2C+R&rft.au=Cruz%2C+S.A.B.%3B+Monteiro%2C+A.M.V.%3B+Santos%2C+R&rft.date=2012&rft.volume=47&rft.pages=60%E2%80%9374&rft_id=info:doi\/10.1016%2Fj.cageo.2011.11.020&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-XiaForming15-46\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-XiaForming15_46-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Xia, J.; Yang, C.; Liu, K. et al. (2015). \"Forming a global monitoring mechanism and a spatiotemporal performance model for geospatial services\". <i>International Journal of Geographical Information Science<\/i> <b>29<\/b> (3): 375-396. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1080%2F13658816.2014.968783\" target=\"_blank\">10.1080\/13658816.2014.968783<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Forming+a+global+monitoring+mechanism+and+a+spatiotemporal+performance+model+for+geospatial+services&rft.jtitle=International+Journal+of+Geographical+Information+Science&rft.aulast=Xia%2C+J.%3B+Yang%2C+C.%3B+Liu%2C+K.+et+al.&rft.au=Xia%2C+J.%3B+Yang%2C+C.%3B+Liu%2C+K.+et+al.&rft.date=2015&rft.volume=29&rft.issue=3&rft.pages=375-396&rft_id=info:doi\/10.1080%2F13658816.2014.968783&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-IbrahimEval09-47\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-IbrahimEval09_47-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Ibrahim, S.; Jin, H.; Lu, L. et al. (2009). \"Evaluating MapReduce on Virtual Machines: The Hadoop Case\". <i>IEEE International Conference on Cloud Computing<\/i> <b>2009<\/b>: 519\u2013528. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2F978-3-642-10665-1_47\" target=\"_blank\">10.1007\/978-3-642-10665-1_47<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Evaluating+MapReduce+on+Virtual+Machines%3A+The+Hadoop+Case&rft.jtitle=IEEE+International+Conference+on+Cloud+Computing&rft.aulast=Ibrahim%2C+S.%3B+Jin%2C+H.%3B+Lu%2C+L.+et+al.&rft.au=Ibrahim%2C+S.%3B+Jin%2C+H.%3B+Lu%2C+L.+et+al.&rft.date=2009&rft.volume=2009&rft.pages=519%E2%80%93528&rft_id=info:doi\/10.1007%2F978-3-642-10665-1_47&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-IqbalBig15-48\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-IqbalBig15_48-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Iqbal, M.H.; Soomro, T.R. (2015). \"Big Data Analysis: Apache Storm Perspective\". <i>International Journal of Computer Trends and Technology<\/i> <b>19<\/b> (1): 9-14. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.14445%2F22312803%2FIJCTT-V19P103\" target=\"_blank\">10.14445\/22312803\/IJCTT-V19P103<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Big+Data+Analysis%3A+Apache+Storm+Perspective&rft.jtitle=International+Journal+of+Computer+Trends+and+Technology&rft.aulast=Iqbal%2C+M.H.%3B+Soomro%2C+T.R.&rft.au=Iqbal%2C+M.H.%3B+Soomro%2C+T.R.&rft.date=2015&rft.volume=19&rft.issue=1&rft.pages=9-14&rft_id=info:doi\/10.14445%2F22312803%2FIJCTT-V19P103&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Reyes-OrtizBig15-49\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-Reyes-OrtizBig15_49-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Reyes-Ortiz, J.L.; Oneto, L.; Anguita, D. (2015). \"Big Data Analytics in the Cloud: Spark on Hadoop vs MPI\/OpenMP on Beowulf\". <i>Procedia Computer Science<\/i> <b>53<\/b>: 121\u2013130. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.procs.2015.07.286\" target=\"_blank\">10.1016\/j.procs.2015.07.286<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Big+Data+Analytics+in+the+Cloud%3A+Spark+on+Hadoop+vs+MPI%2FOpenMP+on+Beowulf&rft.jtitle=Procedia+Computer+Science&rft.aulast=Reyes-Ortiz%2C+J.L.%3B+Oneto%2C+L.%3B+Anguita%2C+D.&rft.au=Reyes-Ortiz%2C+J.L.%3B+Oneto%2C+L.%3B+Anguita%2C+D.&rft.date=2015&rft.volume=53&rft.pages=121%E2%80%93130&rft_id=info:doi\/10.1016%2Fj.procs.2015.07.286&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MengMLib16-50\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MengMLib16_50-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Meng, X.; Bradley, J.; Yavuz, B. et al. (2016). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/jmlr.org\/papers\/v17\/15-237.html\" target=\"_blank\">\"MLlib: Machine Learning in Apache Spark\"<\/a>. <i>Journal of Machine Learning Research<\/i> <b>17<\/b> (34): 1\u20137<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/jmlr.org\/papers\/v17\/15-237.html\" target=\"_blank\">http:\/\/jmlr.org\/papers\/v17\/15-237.html<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=MLlib%3A+Machine+Learning+in+Apache+Spark&rft.jtitle=Journal+of+Machine+Learning+Research&rft.aulast=Meng%2C+X.%3B+Bradley%2C+J.%3B+Yavuz%2C+B.+et+al.&rft.au=Meng%2C+X.%3B+Bradley%2C+J.%3B+Yavuz%2C+B.+et+al.&rft.date=2016&rft.volume=17&rft.issue=34&rft.pages=1%E2%80%937&rft_id=http%3A%2F%2Fjmlr.org%2Fpapers%2Fv17%2F15-237.html&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GiachettaAFrame-51\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GiachettaAFrame_51-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Giachetta, R. (2015). \"A framework for processing large scale geospatial and remote sensing data in MapReduce environment\". <i>Computers & Graphics<\/i> <b>49<\/b>: 37\u201346. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.cag.2015.03.003\" target=\"_blank\">10.1016\/j.cag.2015.03.003<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+framework+for+processing+large+scale+geospatial+and+remote+sensing+data+in+MapReduce+environment&rft.jtitle=Computers+%26+Graphics&rft.aulast=Giachetta%2C+R.&rft.au=Giachetta%2C+R.&rft.date=2015&rft.volume=49&rft.pages=37%E2%80%9346&rft_id=info:doi\/10.1016%2Fj.cag.2015.03.003&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HuangGeo16-52\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HuangGeo16_52-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Huang, S.; Lui, X. (2016). \"Geological Data Informatization and Standardization Based on Geological Big Data\". <i>Coal Geology of China<\/i> <b>28<\/b> (7): 74\u201378. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3969%2Fj.issn.1674-1803.2016.07.17\" target=\"_blank\">10.3969\/j.issn.1674-1803.2016.07.17<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Geological+Data+Informatization+and+Standardization+Based+on+Geological+Big+Data&rft.jtitle=Coal+Geology+of+China&rft.aulast=Huang%2C+S.%3B+Lui%2C+X.&rft.au=Huang%2C+S.%3B+Lui%2C+X.&rft.date=2016&rft.volume=28&rft.issue=7&rft.pages=74%E2%80%9378&rft_id=info:doi\/10.3969%2Fj.issn.1674-1803.2016.07.17&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KouameTheStren17-53\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-KouameTheStren17_53-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Kouame, K.J.A.; Jiang, F.; Feng, T.; Zhu, S. (2017). \"The Strengthening of Geological Infrastructure, Research and Data Acquisition - Using Gis in Ivory Coast Gold Mines\". <i>MATEC Web of Conferences<\/i> <b>95<\/b>: 18001. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1051%2Fmatecconf%2F20179518001\" target=\"_blank\">10.1051\/matecconf\/20179518001<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+Strengthening+of+Geological+Infrastructure%2C+Research+and+Data+Acquisition+-+Using+Gis+in+Ivory+Coast+Gold+Mines&rft.jtitle=MATEC+Web+of+Conferences&rft.aulast=Kouame%2C+K.J.A.%3B+Jiang%2C+F.%3B+Feng%2C+T.%3B+Zhu%2C+S.&rft.au=Kouame%2C+K.J.A.%3B+Jiang%2C+F.%3B+Feng%2C+T.%3B+Zhu%2C+S.&rft.date=2017&rft.volume=95&rft.pages=18001&rft_id=info:doi\/10.1051%2Fmatecconf%2F20179518001&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KarlssonLife17-54\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-KarlssonLife17_54-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Karlsson, C.S.J.; Miliutenko, S.; Bj\u00f6rklund, A. et al. (2017). \"Life cycle assessment in road infrastructure planning using spatial geological data\". <i>International Journal of Life Cycle Assessment<\/i> <b>22<\/b> (8): 1302\u20131317. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1007%2Fs11367-016-1241-3\" target=\"_blank\">10.1007\/s11367-016-1241-3<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Life+cycle+assessment+in+road+infrastructure+planning+using+spatial+geological+data&rft.jtitle=International+Journal+of+Life+Cycle+Assessment&rft.aulast=Karlsson%2C+C.S.J.%3B+Miliutenko%2C+S.%3B+Bj%C3%B6rklund%2C+A.+et+al.&rft.au=Karlsson%2C+C.S.J.%3B+Miliutenko%2C+S.%3B+Bj%C3%B6rklund%2C+A.+et+al.&rft.date=2017&rft.volume=22&rft.issue=8&rft.pages=1302%E2%80%931317&rft_id=info:doi\/10.1007%2Fs11367-016-1241-3&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-StockToOnt12-55\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-StockToOnt12_55-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Stock, K.; Stojanovic, T.; Reitsma, F. et al. (2012). \"To ontologise or not to ontologise: An information model for a geospatial knowledge infrastructure\". <i>Computers & Geosciences<\/i> <b>45<\/b>: 98-108. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.cageo.2011.10.021\" target=\"_blank\">10.1016\/j.cageo.2011.10.021<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=To+ontologise+or+not+to+ontologise%3A+An+information+model+for+a+geospatial+knowledge+infrastructure&rft.jtitle=Computers+%26+Geosciences&rft.aulast=Stock%2C+K.%3B+Stojanovic%2C+T.%3B+Reitsma%2C+F.+et+al.&rft.au=Stock%2C+K.%3B+Stojanovic%2C+T.%3B+Reitsma%2C+F.+et+al.&rft.date=2012&rft.volume=45&rft.pages=98-108&rft_id=info:doi\/10.1016%2Fj.cageo.2011.10.021&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HunterAWeb16-56\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HunterAWeb16_56-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Hunter, J.; Brooking, C.; Reading, L.; Vink, S. (2016). \"A Web-based system enabling the integration, analysis, and 3D sub-surface visualization of groundwater monitoring data and geological models\". <i>International Journal of Digital Earth<\/i> <b>9<\/b>: 197-214. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1080%2F17538947.2014.1002866\" target=\"_blank\">10.1080\/17538947.2014.1002866<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+Web-based+system+enabling+the+integration%2C+analysis%2C+and+3D+sub-surface+visualization+of+groundwater+monitoring+data+and+geological+models&rft.jtitle=International+Journal+of+Digital+Earth&rft.aulast=Hunter%2C+J.%3B+Brooking%2C+C.%3B+Reading%2C+L.%3B+Vink%2C+S.&rft.au=Hunter%2C+J.%3B+Brooking%2C+C.%3B+Reading%2C+L.%3B+Vink%2C+S.&rft.date=2016&rft.volume=9&rft.pages=197-214&rft_id=info:doi\/10.1080%2F17538947.2014.1002866&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-M.C3.BCllerTheGPlates16-57\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-M.C3.BCllerTheGPlates16_57-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">M\u00fcller, R.D.; Qin, X.; Sandwell, D.T. et al. (2016). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4784813\" target=\"_blank\">\"The GPlates Portal: Cloud-Based Interactive 3D Visualization of Global Geophysical and Geological Data in a Web Browser\"<\/a>. <i>PLoS One<\/i> <b>11<\/b> (3): e0150883. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pone.0150883\" target=\"_blank\">10.1371\/journal.pone.0150883<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4784813\/\" target=\"_blank\">PMC4784813<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26960151\" target=\"_blank\">26960151<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4784813\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4784813<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+GPlates+Portal%3A+Cloud-Based+Interactive+3D+Visualization+of+Global+Geophysical+and+Geological+Data+in+a+Web+Browser&rft.jtitle=PLoS+One&rft.aulast=M%C3%BCller%2C+R.D.%3B+Qin%2C+X.%3B+Sandwell%2C+D.T.+et+al.&rft.au=M%C3%BCller%2C+R.D.%3B+Qin%2C+X.%3B+Sandwell%2C+D.T.+et+al.&rft.date=2016&rft.volume=11&rft.issue=3&rft.pages=e0150883&rft_id=info:doi\/10.1371%2Fjournal.pone.0150883&rft_id=info:pmc\/PMC4784813&rft_id=info:pmid\/26960151&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4784813&rfr_id=info:sid\/en.wikipedia.org:Journal:Big_data_management_for_cloud-enabled_geological_information_services\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. 
Grammar has been updated to make the content more readable.\n<\/p>\n<!-- \nNewPP limit report\nCached time: 20181214185730\nCache expiry: 86400\nDynamic content: false\nCPU time usage: 1.182 seconds\nReal time usage: 1.221 seconds\nPreprocessor visited node count: 41701\/1000000\nPreprocessor generated node count: 42228\/1000000\nPost\u2010expand include size: 289637\/2097152 bytes\nTemplate argument size: 100658\/2097152 bytes\nHighest expansion depth: 18\/40\nExpensive parser function count: 0\/100\n-->\n\n<!-- \nTransclusion expansion time report (%,ms,calls,template)\n100.00% 1162.860 1 - -total\n 88.78% 1032.409 1 - Template:Reflist\n 78.55% 913.438 57 - Template:Citation\/core\n 76.77% 892.769 53 - Template:Cite_journal\n 5.75% 66.817 1 - Template:Infobox_journal_article\n 5.50% 63.961 1 - Template:Infobox\n 5.40% 62.782 47 - Template:Citation\/identifier\n 4.08% 47.465 3 - Template:Cite_web\n 3.74% 43.468 58 - Template:Citation\/make_link\n 3.27% 37.986 80 - Template:Infobox\/row\n-->\n\n<!-- Saved in parser cache with key limswiki:pcache:idhash:10441-0!*!0!!en!5!* and timestamp 20181214185729 and revision id 32616\n -->\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Big_data_management_for_cloud-enabled_geological_information_services\">https:\/\/www.limswiki.org\/index.php\/Journal:Big_data_management_for_cloud-enabled_geological_information_services<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div 
class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","ec047b57c5e01fb4daaaffc7b376efce_images":["https:\/\/www.limswiki.org\/images\/c\/c0\/Fig1_ZhuSciProg2018_2018-2018.png","https:\/\/www.limswiki.org\/images\/b\/bb\/Fig2_ZhuSciProg2018_2018-2018.png","https:\/\/www.limswiki.org\/images\/c\/c0\/Fig3_ZhuSciProg2018_2018-2018.png","https:\/\/www.limswiki.org\/images\/3\/33\/Fig4_ZhuSciProg2018_2018-2018.png","https:\/\/www.limswiki.org\/images\/9\/9b\/Fig5_ZhuSciProg2018_2018-2018.png","https:\/\/www.limswiki.org\/images\/2\/21\/Fig6_ZhuSciProg2018_2018-2018.png","https:\/\/www.limswiki.org\/images\/d\/d7\/Fig7_ZhuSciProg2018_2018-2018.png","https:\/\/www.limswiki.org\/images\/9\/9d\/Fig8_ZhuSciProg2018_2018-2018.png"],"ec047b57c5e01fb4daaaffc7b376efce_timestamp":1544813849,"f83633bd19906c97fe01cf5c6de8eb6e_type":"article","f83633bd19906c97fe01cf5c6de8eb6e_title":"Moving ERP systems to the cloud: Data security issues (Saa et al. 2017)","f83633bd19906c97fe01cf5c6de8eb6e_url":"https:\/\/www.limswiki.org\/index.php\/Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues","f83633bd19906c97fe01cf5c6de8eb6e_plaintext":"\n\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\n\t\t\t\tJournal:Moving ERP systems to the cloud: Data security issues\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tFrom LIMSWiki\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\tJump to: navigation, search\n\n\t\t\t\t\t\n\t\t\t\t\tFull article title\n \nMoving ERP systems to the cloud: Data security issuesJournal\n \nJournal of Information Systems Engineering & ManagementAuthor(s)\n \nSaa, Pablo; Costales, Andr\u00e9s Cueva; Moscoso-Zea, Oswaldo; Lujan-Mora, SergioAuthor affiliation(s)\n \nUniversidad Tecnol\u00f3gica Equinoccial, Yachay Public Company, University of AlicantePrimary contact\n \nEmail: psaa at ute dot edu dot ecYear published\n \n2017Volume and issue\n \n2(4)Page(s)\n \n21DOI\n \n10.20897\/jisem.201721ISSN\n \n2468-4376Distribution license\n \nCreative Commons Attribution 4.0 
InternationalWebsite\n \nhttp:\/\/www.lectitopublishing.nl\/Article\/Detail\/8972P1SADownload\n \nhttp:\/\/www.lectitopublishing.nl\/download\/8972P1SA (PDF)\n\nContents\n\n1 Abstract \n2 Introduction \n3 Method \n4 Literature review \n\n4.1 Cloud ERP \n4.2 Traditional ERP vs cloud ERP \n4.3 Data security issues in cloud ERP \n4.4 Confidentiality \n\n4.4.1 Uncertainty around data storage arrangements \n4.4.2 Lack of control over the security protocols and standards \n\n\n4.5 Integrity \n\n4.5.1 Relationship of trust between the cloud provider and client \n4.5.2 Provider\u2019s transaction management standards \n\n\n4.6 Summary of data security issues \n\n\n5 Findings \n6 Recommendations and possible solutions \n7 Conclusion \n8 Acknowledgements \n9 References \n10 Notes \n\n\n\nAbstract \nThis paper brings to light data security issues and concerns for organizations moving their enterprise resource planning (ERP) systems to the cloud. Cloud computing has become a new trend in how organizations conduct business and has enabled them to innovate and compete in a dynamic environment through new and innovative business models. The growing popularity and success of the cloud has led to the emergence of cloud-based software as a service (SaaS) ERP systems, a new alternative approach to traditional on-premise ERP systems. Cloud-based ERP has a myriad of benefits for organizations. However, infrastructure engineers need to address data security issues before moving their enterprise applications to the cloud. Cloud-based ERP raises specific concerns about the confidentiality and integrity of the data stored in the cloud. How strongly these concerns affect the adoption of cloud-based ERP depends on the size of the organization. Small to medium enterprises (SMEs) gain the maximum benefits from cloud-based ERP as many of the concerns around data security are not relevant to them. 
On the contrary, larger organizations are more cautious in moving their mission-critical enterprise applications to the cloud. A hybrid solution where organizations can choose to keep their sensitive applications on-premise while leveraging the benefits of the cloud is proposed in this paper as an effective solution that is gaining momentum and popularity for large organizations.\nKeywords: ERP, cloud computing, cloud ERP, data security, confidentiality, integrity\n\nIntroduction \n\u201cThe cloud\u201d has been a buzzword in the last few years and has caused a revolution in the information and communication technologies (ICT) industry. As IBM states, \u201cCloud computing, often referred to as simply \u2018the cloud,\u2019 is the delivery of on-demand computing resources, everything from applications to data centers over the internet on a pay-for-use basis.\u201d[1] This new trend changes the way organizations deploy services, platforms, and infrastructure of information technologies (IT). The variety of applications and services offered by this new concept affect organizations and individuals who notice the benefits of cloud services in terms of efficiency, flexibility, and reduced investment effort, while technology companies and traditional operators see an opportunity to expand their businesses.[2]\nAccording to Gartner, cloud-based services can be defined as \u201cmassively scalable system capabilities delivered as a service to external users using internet technologies.\u201d[3] A study about cloud computing models describes that based on the completeness and abstraction levels of services delivered to the end user, there are three types of services offered through the cloud, namely infrastructure-as-a-service (IaaS), platform-as-a-service (PaaS), and software-as-a-service (SaaS).[4] \nCloud computing has marked a substantial change in how IT services are developed, implemented, updated, maintained, and paid for. 
The evolution from traditional service organizations to the emergence of full internet-based service providers, namely through the cloud, enables the provision of flexible, scalable, and economical services.[5]\nIn an environment of global competition, there is growing recognition of the central role of IT in determining the overall success of organizations. The alignment of business objectives, strategic vision, and information technology, combined with strategic planning, could be seen as a key objective to seek efficiency in their operations. Enterprise resource planning (ERP) systems have played an important role in the integration of business functions within organizations to support the generation of products and services.[6] In any modern organization, the term ERP refers to the software used to plan and manage the organization\u2019s resources across all functional areas by integrating the information through those functions and beyond the boundaries of the organization.[7]\nIn today\u2019s highly competitive business landscape, the trend for organizations is to focus their resources and efforts on what they do best and leave the supportive services in the hands of more specialized third parties. The world\u2019s economic model in IT today is moving from \u201cbuy and own\u201d (on-premise) to a subscription-based, pay-per-use (cloud-based) model. The migration from traditional (on-premise) ERP to cloud-based ERP could help organizations to manage their costs efficiently and improve their operations. As such, deploying ERP software in a hosted or on-demand environment could support organizations to improve their business processes and remain competitive.\nCloud-based ERP provides organizations with the possibility to choose the provider that best suits their needs, eliminating inflexible traditional on-premise ERP solutions. 
However, Lenart[8] argued that while there are many advantages to the use of ERP implemented in a SaaS model, there are also drawbacks, especially those related to security and integrity of the data stored in the cloud.\nHence, the research question explored in this paper is \u201cwhat are the data security issues in cloud-based SaaS ERPs?\u201d\nThe next section presents the methodology used in this study. Following that is a literature review of cloud-based ERP, comparing the advantages of ERP when adopted as a pay-per-use model versus a traditional on-premise solution. After the literature review, several findings on cloud ERP are presented, illustrating the adoption factors and benefits for small, medium, and large organizations. Finally, the paper closes with recommendations for organizations to ensure the security of sensitive corporate information when adopting cloud-based ERP, followed by the conclusion.\n\nMethod \nThe research approach was based on an exploratory search to review the existing literature on SaaS cloud-based ERPs and their benefits. Additionally, several papers were studied to identify data security issues, particularly confidentiality and integrity problems that organizations should be aware of before adopting cloud-based ERP solutions. More than 50 articles from 2008 to 2015 were drawn from several A and A* journals[9] such as Journal of Information Systems, MIS Quarterly, Journal of Innovation, Management and Technology, Journal of Systems and Information Technology, International Journal of Computer Applications, and Journal of Network and Computer Applications, among others. Searches were made using well-regarded academic databases and search engines for computer science and information systems fields: IEEE Xplore, Emerald, ACM Digital Library, Gartner Core Research, Science Direct, and Google Scholar. 
Furthermore, specific search terms included \u201ccloud ERP,\u201d \u201chybrid ERP,\u201d \u201cimplementation of ERP,\u201d \u201cSaaS ERP,\u201d \u201ccloud computing,\u201d and \u201cdata security issues.\u201d\nAfter all the articles and papers were reviewed, key insights and findings were gathered and classified according to the size of organizations. Based on the findings, several recommendations and possible solutions are outlined in this paper.\n\nLiterature review \nCloud ERP \nThe success of cloud computing, combined with the increasing pressure on organizations to respond to unique customer needs in the increasingly competitive business environments of today, has given rise to the new subscription-based delivery model for ERP, also referred to as cloud-based ERP or SaaS ERP. This new model of ERP systems functions in the same way as a traditional on-premise ERP solution. The main difference is that the infrastructure (the software, as well as the hardware and network connection) adopts a pay-per-use model; in other words, ERP is delivered as a service.[7] The ERP in a SaaS model is accessed over the internet, while the application and data are controlled by the cloud service provider and offered as a \u201cready-to-use\u201d product to the end client for a monthly subscription fee.[10]\n\nTraditional ERP vs cloud ERP \nA cloud-based ERP system uses the advantages of cloud computing to offer a new and more flexible approach to hosting and using ERP systems. 
A widespread shift from traditional ERP system architecture towards cloud-based SaaS ERP systems is ongoing.[8] The advantages of cloud computing include ease of use and accessibility, virtualized resources, scalability, affordability, and availability, guaranteed through service level agreements (SLAs).[11] Cloud computing, and in particular SaaS technology, enables ERP systems to turn some of their typical weaknesses, namely inflexibility, lack of scalability, and the consumption of massive local resources (hardware, manpower, and financial expenditures), into advantages. Although significant concerns remain in the form of limited functionality, the potential loss of internal control, performance reliability, and security, cloud-based models continue to gain traction.[12]\nFigure 1 shows the differences in operating costs, solution complexity, and implementation time of a traditional on-premise ERP system in comparison to cloud-based ERP systems.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 1. ERP systems deployment models[12]\n\n\n\nIn comparison to traditional ERPs, the advantages of cloud-based ERPs include[10]:\n\n enabling smaller clients who are not able to set up a complete, complex ERP system on-premise to use ERP;\n saving infrastructure expenditures (no large up-front capital investment necessary), as well as software, maintenance, and updating costs[13];\n reducing the staff needed for support and maintenance;\n enabling faster implementation of a cloud-based ERP, with less effort needed due to its agile design[13];\n offering better scalability (hardware\/performance\/user accounts can be increased quickly when needed and just as easily reduced when resources are no longer needed); and\n enabling mobility (server in the cloud is always accessible, wherever the employee works). 
\nPossible disadvantages include:\n\n storage of organizational data in the cloud rather than on-premise;\n possible integrity and security issues due to loss of control over data storage and systems; and\n dependency on the cloud provider.\nData security issues in cloud ERP \nAs discussed in the previous sections, there is a clear tendency to move enterprise services and systems to the cloud. However, it is important for organizations that want to implement or use an ERP in the cloud (SaaS, PaaS, or IaaS) to address the possible issues and risks of migration. Some of the main drawbacks in any cloud-based ERP are related to data security, performance, and availability. Dillon et al.[14] have categorized security of data as the primary concern for organizations. Accordingly, this paper is focused on data security issues for cloud (SaaS) ERP.\nBishop[15] states that computer security relies on the confidentiality, integrity, and availability of data. In that context, cloud computing and ERP systems directly influence the required level of security. For example, as mentioned in the previous sections, ERP systems manage organizational data for essential business operations. Therefore, it is crucial for organizations to ensure data confidentiality and integrity in a cloud environment.\n\nConfidentiality \nWeng and Hung[16] explain that when organizations adopt cloud-based ERP systems, they should be prepared to mitigate the risks around cloud technologies and prevent unauthorized usage of data. In addition, Johansson et al.[17] found that organizations might feel insecure storing their data at external providers without having direct control over the data. Another problem that might affect the confidentiality of data is the lack of control over the cloud provider\u2019s staff, who could access and retrieve data for dishonest or even criminal activities. 
For instance, Hashizume et al.[18] argue that providers might not perform detailed background checks on their staff, who have unlimited access to the cloud data. Consequently, the key challenges to adopting cloud-based ERP are as follows.\n\nUncertainty around data storage arrangements \nWith the SaaS model, the client does not have any control over the IT infrastructure.[19] Moreover, Puthal et al.[20] mention that the same provider often hosts data from several clients in the same data center. This type of hosting increases the risk of data leakage or corporate espionage. By contrast, with on-premise ERP systems, organizations have absolute control over their data and infrastructure. Consequently, the way in which providers ensure the security and confidentiality of the client\u2019s data is one of the key challenges in the implementation of cloud-based ERP. Furthermore, in cases where the provider also offers public access to specific cloud services, the security challenges are even greater.\n\nLack of control over the security protocols and standards \nEven though the number of reported security incidents from the industry regarding cloud-based ERPs is still small, their rapid adoption increasingly raises security concerns for organizations, much more than traditional on-premise ERPs did.[21] Furthermore, clients do not have full control over, or the ability to monitor, who accesses their data on the provider side.[18] The same applies to the protocols and standards used by providers to hire personnel and to implement or monitor their security infrastructure. Consequently, as these factors are dependent on the provider itself, a high level of uncertainty must be considered when implementing ERP on the cloud.\n\nIntegrity \nThe second main concern in securing enterprise data in the cloud is the need to ensure uniformity of the stored data. 
As mentioned by Puthal et al.[20], the integrity of data can easily be lost or affected because of cloud providers\u2019 errors and failures. The same authors also argue that, in the cloud, the traditional enterprise methods to validate the correctness of data are outside the enterprises\u2019 control; they are the responsibility of the cloud provider. As a consequence, a common method used to ensure data integrity in cloud environments is public auditing. This method uses a third-party verifier that provides expert integrity checking services.[20] Even though this method is commonly used by cloud providers, it raises additional issues, such as the risk of leaking sensitive information belonging to the organizations that use those providers. From a similar perspective, Akande et al.[22] claim that the methods of authentication and the levels of authorization to manipulate data are crucial concerns for overall data integrity. \nThe process of selecting and adopting a cloud provider should also take into consideration the following challenges.\n\nRelationship of trust between the cloud provider and client \nAssuring the integrity of data is mainly the responsibility of the provider. Therefore, clients must trust the providers to comply with the agreed-on security measures and protocols to achieve integrity of data. As mentioned by several authors[23][24], the relationship of trust is based not only on the provider\u2019s reputation but also on the specifications of the SLAs between them.\n\nProvider\u2019s transaction management standards \nSubashini and Kavitha[24] argue that in complex settings like cloud computing, assuring data integrity is highly difficult. They note that the HTTP transaction protocol does not provide guaranteed delivery of data. Additionally, the study shows that SaaS applications should be based on standardized application programming interfaces (APIs) as a technological basis for interorganizational systems communication. 
Standardized APIs ensure that only intended read and write access to data is allowed. However, this best practice for managing data integrity is often not considered by cloud service providers.\n\nSummary of data security issues \nBased on the literature review, Table 1 summarizes the major data security concerns that IT leaders should consider in order to move their ERP systems into the cloud.\n\nTable 1. Data security issues\n\nIssue: Confidentiality\nDescription:\n Lack of data control\n Lack of staff control from cloud provider\n Uncertainty on data storage arrangements\n Lack of control over security protocols and standards\n\nIssue: Integrity\nDescription:\n Lack of uniformity on stored data\n Information leakage by third parties over organizations using cloud providers\n Lack of trust between the cloud provider and client\n Beware of provider\u2019s transaction management standards\n\nIssue: Availability\nDescription:\n Depends on cloud provider\n\nFindings \nCloud technologies provide a disruptive alternative to traditional on-premise ERP solutions and are offering innovative ways to generate business value and maintain competitive advantage.[16] Despite the myriad benefits that cloud-based ERP offers (such as flexibility, scalability, ease of implementation, and cost savings[12]), one of the biggest impediments to adopting cloud-based ERP is the risk around data security, namely integrity and confidentiality of the organization's data. In a recent survey conducted by the IDC group, 50% of the 1,100 organizations asked about the top inhibitors for cloud-based ERP solutions said that security and confidentiality of the data are their primary concern when thinking about moving their enterprise systems to the cloud.[25] See Figure 2 for more.\n\nFigure 2. 
Top inhibitors for cloud ERP[25]\n\nSaaS is gaining popularity and is changing the way organizations deploy and use ERP systems. However, the concerns around data integrity and confidentiality need to be addressed before organizations can successfully implement SaaS-based ERP solutions. Additionally, existing literature shows that adoption rates for cloud-based ERP are highly dependent on the industry type and functions.[26] Given the important role that ERP systems play in the functioning of an organization, having to move mission-critical applications to a third-party cloud vendor and dealing with the associated security issues could negatively impact SaaS-based ERP adoption rates.[10]\nThe literature suggests that, owing to low capital expenditure and accelerated time to market, small to medium enterprises (SMEs) benefit more readily from cloud-based ERPs, since many of the issues and challenges revolve around data security, confidentiality, and the relocation of mission-critical applications to the cloud, which are often not primary concerns for SMEs.[7][27] The risks associated with storing an organization\u2019s sensitive data on the cloud, and its associated data confidentiality and integrity issues, are less of an inhibitor for SMEs when adopting cloud-based ERP, as they do not possess the financial resources to build and implement an on-premise ERP solution in the first place.[7] SMEs also believe that, given their lack of IT expertise, the security measures that cloud-based ERP vendors provide are more sophisticated than those they could implement on-premise. 
In the long run, the operational expenditure of a cloud-based ERP solution is far lower for SMEs, enabling them to reduce their overall IT expenditure while gaining access to state-of-the-art IT infrastructure and expertise through a pay-per-use model.[7] A SaaS ERP solution also gives SMEs the opportunity to channel their resources effectively towards the important aspects of their business, enabling them to maintain their competitive advantage.[7]\nOn the other hand, cloud-based ERP implementations raise many security concerns for larger organizations, as they feel insecure storing their confidential and sensitive information on the cloud, particularly since they have to hand over control to the provider to process the information. Larger organizations are heavily concerned about the probability and impact of a potential security breach that could, for example, damage their reputation, result in financial losses, and in some cases even represent industrial espionage.[7] As a result of these concerns, larger organizations are not motivated to move their mission-critical applications to the cloud, and since they normally have highly skilled internal IT teams, they prefer to implement on-premise ERP systems with high security standards. Another factor that influences larger organizations to continue with their on-premise ERP solutions is the subscription model associated with SaaS-based solutions. Due to the large user base and the number of ERP modules of these organizations, in the long run the subscription fees for cloud-based ERPs are higher than the cost of implementing and maintaining an on-premise solution.[7] Utzig et al.[12] state, \u201cthe total cost of ownership for a cloud-based solution can be 50% to 60% less than for traditional solutions over a 10-year period.\u201d Even so, for large organizations, moving on-premise ERP systems to the cloud cannot be assumed to bring cost savings. 
A previous study by Utzig et al.[12], summarized in Figure 3, compares the costs of on-premise and cloud-based solutions.\n\nFigure 3. Cost comparison of on-premise and cloud-based solutions[12]\n\nRecommendations and possible solutions \nGiven the existing concerns about data security in cloud-based ERPs, organizations should take proactive measures to ensure that sufficient data security policies and procedures are in place and negotiated with the cloud vendor in order to secure the confidentiality and integrity of sensitive corporate data.[26] The following are some recommendations that organizations, specifically large enterprises, should follow before moving their ERP applications to the cloud[16]:\n\n Organizations should negotiate stringent policies and SLAs with cloud vendors to ensure protection of sensitive information stored in the cloud. The policies should clearly define how each type of information is classified. \n The internal IT teams and security experts of the organizations should always be involved when evaluating cloud vendors and their security standards. \n Organizations should always perform an extensive analysis and implement control mechanisms before sharing confidential and sensitive information with cloud vendors. \n Organizations should evaluate which applications are critical to their business to maintain their competitive advantage and thereby define strict policies for the information and applications that could be moved to the cloud. \n Cloud vendors should be transparent about their network security infrastructure and should provide this information to the client. 
\n Organizations should run training programs and campaigns that educate their employees about the data security risks that are possible in cloud-based ERPs and the actions needed to mitigate those risks, so that sensitive corporate information is not compromised.[26]\nIn addition to the above recommendations, organizations should also ensure that a comprehensive security strategy is defined before migrating their enterprise applications to the cloud. Specific security standards need to be enforced at all levels by incorporating a framework that addresses security at the physical, network, data, and application level.[28] A security framework should include components relating to physical security, data storage security, access security, application security, and transmission security. Physical security policies should include rules of conduct for employees and mechanisms to ensure those rules are being followed.\nStrict access security policies to prevent unauthorized access from internal and external sources should be enforced as well. Application security should include authentication mechanisms to verify the identity of the end users. Data security should always include strong encryption techniques to prevent any possible data leakage.[29] Furthermore, the authentication module should define exactly what level of access each user has.\nAdditionally, mechanisms to ensure integrity of data and to safeguard its uniformity across multiple locations should be put in place. In order to assure confidentiality and integrity of data, its transmission to the provider should be secured by encryption mechanisms. 
These measures should be applied on both the provider and the client sides.[28] The security strategy should also include a contingency plan that gives the organization the capability and resources to move to a new cloud provider in an emergency, in the shortest possible time and with the least amount of impact.\nSMEs are more open to moving the entirety of their applications to the cloud, whereas larger organizations are still more conservative in their approach due to the risks associated with potential security breaches and their ability to implement high security standards for their on-premise solutions themselves.[7] Thus, SMEs adopt cloud-based ERP solutions at a faster rate than larger organizations. However, a recent development that is gaining popularity and momentum among larger organizations is a two-tier ERP strategy, also known as hybrid cloud-based ERP. Ruivo et al.[30] argue that more than 77% of IT firms will implement hybrid ERP solutions; however, only slightly more than 20% currently have structured plans to implement this technology. In addition, Peng and Gala[23] consider hybrid ERP an effective way for organizations to keep on-premise ERP core functions combined with business cloud services before moving to a full cloud-based ERP solution.\nHybrid cloud-based ERP provides organizations with the best of both worlds. Organizations can choose to keep their mission-critical applications on-premise while migrating the other modules of the ERP into the cloud. A report from PwC[26] suggests that one of the key aspects of hybrid ERP is allowing organizations to take functions out of the on-premise ERP and move them to the cloud, therefore providing organizations with a higher degree of flexibility to support business operations with the use of cloud technology. For instance, the same report shows that the core operations related to inventory, financials, or employee master management could remain as part of the on-premise ERP. 
This agile and highly flexible approach allows them to implement more sophisticated, customer-driven business models.[31] It enables organizations to take advantage of the benefits of cloud-based ERP while minimizing the risks of storing sensitive corporate data on the cloud.[23]\n\nConclusion \nSeveral of cloud computing's benefits encourage organizations to evaluate and implement an ERP system in the cloud, based on the SaaS distribution model. This new approach to ERPs turns some of the weaknesses of traditional ERPs into benefits. The main benefits of cloud-based ERPs are their scalability and lower investment costs, creating opportunities for SMEs.\nHowever, the main weaknesses of and threats to this new approach are the security and integrity risks to the data stored in the system, which have been discussed in this paper. Large organizations in particular adopt cloud-based ERP systems slowly due to concerns about storing sensitive information securely on third-party servers. The risk of breaches in security and integrity as well as possible misuse of confidential information by the service providers are further drawbacks.\nNevertheless, a new type of solution has begun to take hold in large organizations, one that combines the best of both worlds (cloud and traditional ERPs): hybrid cloud-based ERPs or two-tiered ERPs. Hybrid cloud-based ERPs allow organizations to store their most sensitive data in on-premise solutions while migrating the other modules into a cloud solution. This enables them to benefit from the agility and scalability of cloud-based ERP solutions while still keeping the security advantages of on-premise solutions for their mission-critical data. Another benefit inherited from cloud-based solutions is the ability to deploy services on demand, reducing the risk associated with the implementation of an entire module for a core on-premise ERP. 
Moreover, the ability to enhance mobility, system performance, and customization is driving organizations to move to hybrid ERP solutions.[23] Therefore, hybrid cloud-based ERPs are especially suitable for larger organizations, which have so far hesitated to move their ERPs into the cloud.\n\nAcknowledgements \nAn initial version of this paper was published as \u201cData Security Issues in Cloud-Based Software-as-a-Service ERP,\u201d in the 12th Iberian Conference on Information Systems and Technologies (CISTI 2017).\n\nReferences \n\n\n\u2191 \"What is cloud computing?\". IBM. https:\/\/www.ibm.com\/cloud\/learn\/what-is-cloud-computing . Retrieved 01 February 2017 .   \n\n\u2191 Lin, A.; Chen, N.-C. (2012). \"Cloud computing as an innovation: Perception, attitude, and adoption\". International Journal of Information Management 32 (6): 533\u2013540. doi:10.1016\/j.ijinfomgt.2012.04.001.   \n\n\u2191 \"Cloud Computing\". Gartner IT Glossary. Gartner, Inc. https:\/\/www.gartner.com\/it-glossary\/cloud-computing . Retrieved 27 September 2015 .   \n\n\u2191 Gorelik, E. (January 2013). \"Cloud Computing Models\" (PDF). Massachusetts Institute of Technology. http:\/\/web.mit.edu\/smadnick\/www\/wp\/2013-01.pdf .   \n\n\u2191 O'Loughlin, M. (September 2014). \"IT Service Management and Cloud Computing White Paper\". Axelos Limited. https:\/\/www.axelos.com\/case-studies-and-white-papers\/it-service-management-and-cloud-computing . Retrieved 23 January 2017 .   \n\n\u2191 Shehab, E.M.; Sharp, M.W.; Supramaniam, L.; Spedding, T.A. (2004). \"Enterprise resource planning: An integrative review\". Business Process Management Journal 10 (4): 359\u2013386. doi:10.1108\/14637150410548056.   \n\n\u2191 7.0 7.1 7.2 7.3 7.4 7.5 7.6 7.7 7.8 Johansson, B.; Alajbegovic, A.; Alexopoulos, V.; Desalermos, A. (2014). \"Cloud ERP Adoption Opportunities and Concerns: A Comparison between SMES and Large Companies\". Pre-ECIS 2014 Workshop \"IT Operations Management\". 
http:\/\/lup.lub.lu.se\/record\/4770066 .   \n\n\u2191 8.0 8.1 Lenart, A. (2011). \"ERP in the Cloud - Benefits and Challenges\". In Wrycza, S. Research in Systems Analysis and Design: Models and Methods. Lecture Notes in Business Information Processing. 93. Springer. pp. 39\u201350. ISBN 9783642256769.   \n\n\u2191 \"CORE Journal Portal\". Computing Research & Education. http:\/\/portal.core.edu.au\/jnl-ranks\/ . Retrieved 23 January 2017 .   \n\n\u2191 10.0 10.1 10.2 Johansson, B.; Ruivo, P. (2013). \"Exploring Factors for Adopting ERP as SaaS\". Procedia Technology 9: 94\u201399. doi:10.1016\/j.protcy.2013.12.010.   \n\n\u2191 Vaquero, L.M.; Rodero-Merino, L.; Caceres, J.; Lindner, M. (2009). \"A break in the clouds: Towards a cloud definition\". ACM SIGCOMM Computer Communication Review 39 (1): 50\u201355. doi:10.1145\/1496091.1496100.   \n\n\u2191 12.0 12.1 12.2 12.3 12.4 12.5 Utzig, C.; Holland, D.; Horvath, M.; Manohar, M. (2013). \"ERP in the cloud: Is it ready? Are you?\" (PDF). Booz & Company. https:\/\/www.strategyand.pwc.com\/media\/file\/Strategyand_ERP-in-the-Cloud.pdf . Retrieved 01 February 2017 .   \n\n\u2191 13.0 13.1 Elragal, A.; El Kommos, M. (2012). \"In-House versus In-Cloud ERP Systems: A Comparative Study\". Journal of Enterprise Resource Planning Studies 2012 (2012): 659957. doi:10.5171\/2012.659957.   \n\n\u2191 Dillon, T.; Wu, C.; Chang, E. (2010). \"Cloud Computing: Issues and Challenges\". Proceedings of 24th IEEE International Conference on Advanced Information Networking and Applications 2010: 27\u201333. doi:10.1109\/AINA.2010.187.   \n\n\u2191 Bishop, M. (2005). Introduction to Computer Security. Addison-Wesley. ISBN 9780321247445.   \n\n\u2191 16.0 16.1 16.2 Weng, F.; Hung, M.-C. (2014). \"Competition and Challenge on Adopting Cloud ERP\". International Journal of Innovation, Management and Technology 5 (4): 309\u2013313. doi:10.7763\/IJIMT.2014.V5.531.   \n\n\u2191 Johansson, B.; Alajbegovic, A.; Alexopoulos, V.; Desalermos, A. (2015). 
\"Cloud ERP Adoption Opportunities and Concerns: The Role of Organizational Size\". Proceedings from the 48th Hawaii International Conference on System Sciences (HICSS), 2015 2015: 4211\u20134219. doi:10.1109\/HICSS.2015.504.   \n\n\u2191 18.0 18.1 Hashizume, K.; Rosado, D.; Fern\u00e1ndez-Medina, E.; Fernandez, E. (2013). \"An analysis of security issues for cloud computing\". Journal of Internet Services and Applications 4: 5. doi:10.1186\/1869-0238-4-5.   \n\n\u2191 Kumar, V.; Garg, K.K. (2012). \"Migration of Services to the Cloud Environment: Challenges and Best Practices\". International Journal of Computer Applications 55 (1): 1\u20136. doi:10.5120\/8716-7105.   \n\n\u2191 20.0 20.1 20.2 Puthal, D.; Sahoo, B.; Mishra, S.; Swain, S. (2015). \"Cloud Computing Features, Issues, and Challenges: A Big Picture\". Proceedings from the International Conference on Computational Intelligence and Networks (CINE), 2015 2015: 116\u2013123. doi:10.1109\/CINE.2015.31.   \n\n\u2191 Castellina, N. (December 2011). \"SaaS and Cloud ERP Trends, Observations, and Performance 2011\" (PDF). Aberdeen Group. http:\/\/www.meritsolutions.com\/resources\/whitepapers\/Aberdeen-Research-SaaS-Cloud-ERP-Trands-2011.pdf .   \n\n\u2191 Akande, A.O.; April, N.A.; Van Belle, J.-P. (2013). \"Management Issues with Cloud Computing\". Proceedings of the Second International Conference on Innovative Computing and Cloud Computing 2013: 119\u2013124. doi:10.1145\/2556871.2556899.   \n\n\u2191 23.0 23.1 23.2 23.3 Peng, G.C.A.; Gala, C. (2014). \"Cloud ERP: A New Dilemma to Modern Organisations?\". Journal of Computer Information Systems 54 (4): 22\u201330. doi:10.1080\/08874417.2014.11645719.   \n\n\u2191 24.0 24.1 Subashini, S.; Kavitha, V. (2011). \"A survey on security issues in service delivery models of cloud computing\". Journal of Network and Computer Applications 34 (1): 1\u201311. doi:10.1016\/j.jnca.2010.07.006.   \n\n\u2191 25.0 25.1 Fauscette, M. (December 2013). 
\"ERP in the Cloud and the Modern Business\". Oracle. https:\/\/go.oracle.com\/LP=1093?elqCampaignId=2026 . Retrieved 23 January 2017 .   \n\n\u2191 26.0 26.1 26.2 26.3 Clark, N.; Dawson, D.; Heard, K.; Manohar, M. (2014). \"Beyond ERP: New Technology, new options\". Booz & Company. https:\/\/www.strategyand.pwc.com\/reports\/beyond-erp . Retrieved 23 January 2017 .   \n\n\u2191 Wailgum, T. (14 August 2008). \"Impact of SaaS on the enterprise ERP market\". InfoWorld. IDG Communications, Inc. https:\/\/www.infoworld.com\/article\/2652900\/applications\/impact-of-saas-on-the-enterprise-erp-market.html . Retrieved 23 January 2017 .   \n\n\u2191 28.0 28.1 Binu, S.; Meenakumari, J. (2012). \"A security framework for an enterprise system on cloud\". Indian Journal of Computer Science and Engineering 3 (4): 548\u2013552. http:\/\/www.ijcse.com\/ijcse-issue.html?issue=20120304 .   \n\n\u2191 Kumbhar, N.N.; Chaudhari, V.V.; Badhe, M.A. (2012). \"The Comprehensive Approach for Data Security in Cloud Computing: A Survey\". International Journal of Computer Applications 39 (18): 23\u201329. doi:10.5120\/5080-7433.   \n\n\u2191 Ruivo, P.; Rodrigues, J.; Oliveira, T. (2015). \"The ERP Surge of Hybrid Models - An Exploratory Research into Five and Ten Years Forecast\". Procedia Computer Science 64: 594\u2013600. doi:10.1016\/j.procs.2015.08.572.   \n\n\u2191 Columbus, L. (27 January 2015). \"Five Catalysts Accelerating Cloud ERP Growth In 2015\". Forbes. http:\/\/www.forbes.com\/sites\/louiscolumbus\/2015\/01\/27\/five-catalysts-accelerating-cloud-erp-growth-in-2015 . Retrieved 23 January 2017 .   \n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. 
The original article lists references alphabetically, but this version \u2014 by design \u2014 lists them in order of appearance.\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\">https:\/\/www.limswiki.org\/index.php\/Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues<\/a>\n\n
This page was last modified on 6 February 2018, at 21:44.\nContent is available under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise noted.\n","f83633bd19906c97fe01cf5c6de8eb6e_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_Moving_ERP_systems_to_the_cloud_Data_security_issues skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" lang=\"en\">Journal:Moving ERP systems to the cloud: Data security issues<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p>This paper brings to light data security issues and concerns for organizations by moving their <a href=\"https:\/\/www.limswiki.org\/index.php\/Enterprise_resource_planning\" title=\"Enterprise resource planning\" target=\"_blank\" class=\"wiki-link\" data-key=\"07be791b94a208f794e38224f0c0950b\">enterprise resource planning<\/a> (ERP) systems to the cloud. 
<a href=\"https:\/\/www.limswiki.org\/index.php\/Cloud_computing\" title=\"Cloud computing\" target=\"_blank\" class=\"wiki-link\" data-key=\"fcfe5882eaa018d920cedb88398b604f\">Cloud computing<\/a> has become the new trend of how organizations conduct business and has enabled them to innovate and compete in a dynamic environment through new and innovative business models. The growing popularity and success of the cloud has led to the emergence of cloud-based <a href=\"https:\/\/www.limswiki.org\/index.php\/Software_as_a_service\" title=\"Software as a service\" target=\"_blank\" class=\"wiki-link\" data-key=\"ae8c8a7cd5ee1a264f4f0bbd4a4caedd\">software as a service<\/a> (SaaS) ERP systems, a new alternative approach to traditional on-premise ERP systems. Cloud-based ERP has a myriad of benefits for organizations. However, infrastructure engineers need to address <a href=\"https:\/\/www.limswiki.org\/index.php\/Cloud_computing_security\" title=\"Cloud computing security\" target=\"_blank\" class=\"wiki-link\" data-key=\"e259286aaad13c602098dcbed9e8a4ff\">data security<\/a> issues before moving their enterprise applications to the cloud. Cloud-based ERP raises specific concerns about the confidentiality and <a href=\"https:\/\/www.limswiki.org\/index.php\/Data_integrity\" title=\"Data integrity\" target=\"_blank\" class=\"wiki-link\" data-key=\"382a9bb77ee3e36bb3b37c79ed813167\">integrity<\/a> of the data stored in the cloud. Such concerns that affect the adoption of cloud-based ERP are based on the size of the organization. Small to medium enterprises (SMEs) gain the maximum benefits from cloud-based ERP as many of the concerns around data security are not relevant to them. On the contrary, larger organizations are more cautious in moving their mission-critical enterprise applications to the cloud. 
A hybrid solution where organizations can choose to keep their sensitive applications on-premise while leveraging the benefits of the cloud is proposed in this paper as an effective solution that is gaining momentum and popularity for large organizations.\n<\/p><p><b>Keywords<\/b>: ERP, cloud computing, cloud ERP, data security, confidentiality, integrity\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<p>\u201cThe cloud\u201d has been a buzzword in the last few years and has caused a revolution in the information and communication technologies (ICT) industry. As IBM states, \u201cCloud computing, often referred to as simply \u2018the cloud,\u2019 is the delivery of on-demand computing resources, everything from applications to data centers over the internet on a pay-for-use basis.\u201d<sup id=\"rdp-ebb-cite_ref-IBMWhat_1-0\" class=\"reference\"><a href=\"#cite_note-IBMWhat-1\" rel=\"external_link\">[1]<\/a><\/sup> This new trend changes the way organizations deploy services, platforms, and infrastructure of information technologies (IT). 
The variety of applications and services offered by this new concept affect organizations and individuals who notice the benefits of cloud services in terms of efficiency, flexibility, and reduced investment effort, while technology companies and traditional operators see an opportunity to expand their businesses.<sup id=\"rdp-ebb-cite_ref-LinCloud12_2-0\" class=\"reference\"><a href=\"#cite_note-LinCloud12-2\" rel=\"external_link\">[2]<\/a><\/sup>\n<\/p><p>According to Gartner, cloud-based services can be defined as \u201cmassively scalable system capabilities delivered as a service to external users using internet technologies.\u201d<sup id=\"rdp-ebb-cite_ref-GartnerCloud_3-0\" class=\"reference\"><a href=\"#cite_note-GartnerCloud-3\" rel=\"external_link\">[3]<\/a><\/sup> A study about cloud computing models describes that based on the completeness and abstraction levels of services delivered to the end user, there are three types of services offered through the cloud, namely infrastructure-as-a-service (IaaS), platform-as-a-service (PaaS), and software-as-a-service (SaaS).<sup id=\"rdp-ebb-cite_ref-GorelikCloud13_4-0\" class=\"reference\"><a href=\"#cite_note-GorelikCloud13-4\" rel=\"external_link\">[4]<\/a><\/sup> \n<\/p><p>Cloud computing has marked a substantial change in how IT services are developed, implemented, updated, maintained, and paid for. The evolution from traditional service organizations to the emergence of full internet-based service providers, namely through the cloud, enables the provision of flexible, scalable, and economical services.<sup id=\"rdp-ebb-cite_ref-OLoughlinITServices14_5-0\" class=\"reference\"><a href=\"#cite_note-OLoughlinITServices14-5\" rel=\"external_link\">[5]<\/a><\/sup>\n<\/p><p>In an environment of global competition, there is growing recognition of the central role of IT in determining the overall success of organizations. 
The alignment of business objectives, strategic vision, and information technology, combined with strategic planning, could be seen as a key objective to seek efficiency in their operations. Enterprise resource planning (ERP) systems have played an important role in the integration of business functions within organizations to support the generation of products and services.<sup id=\"rdp-ebb-cite_ref-ShehabEnter04_6-0\" class=\"reference\"><a href=\"#cite_note-ShehabEnter04-6\" rel=\"external_link\">[6]<\/a><\/sup> In any modern organization, the term ERP refers to the software used to plan and manage the organization\u2019s resources across all functional areas by integrating the <a href=\"https:\/\/www.limswiki.org\/index.php\/Information\" title=\"Information\" target=\"_blank\" class=\"wiki-link\" data-key=\"6300a14d9c2776dcca0999b5ed940e7d\">information<\/a> through those functions and beyond the boundaries of the organization.<sup id=\"rdp-ebb-cite_ref-JohanssonCloud14_7-0\" class=\"reference\"><a href=\"#cite_note-JohanssonCloud14-7\" rel=\"external_link\">[7]<\/a><\/sup>\n<\/p><p>In today\u2019s highly competitive business landscape, the trend for organizations is to focus their resources and efforts on what they do best and leave the supportive services in the hands of more specialized third parties. The world\u2019s economic model in IT today is moving from \u201cbuy and own\u201d (on-premise) to a subscription-based, pay-per-use (cloud-based) model. The migration from traditional (on-premise) ERP to cloud-based ERP could help organizations to manage their costs efficiently and improve their operations. As such, deploying ERP software in a hosted or on-demand environment could support organizations to improve their business processes and remain competitive.\n<\/p><p>Cloud-based ERP provides organizations with the possibility to choose the provider that best suits their needs, eliminating inflexible traditional on-premise ERP solutions. 
However, Lenart<sup id=\"rdp-ebb-cite_ref-LenartERP11_8-0\" class=\"reference\"><a href=\"#cite_note-LenartERP11-8\" rel=\"external_link\">[8]<\/a><\/sup> argued that while there are many advantages to the use of ERP implemented in a SaaS model, there also are drawbacks, especially those related to security and integrity of the data stored in the cloud.\n<\/p><p>Hence, the research question explored in this paper is \u201cwhat are the data security issues in cloud-based SaaS ERPs?\u201d\n<\/p><p>The next section presents the methodology used in this study. Following that is a literature review done on cloud-based ERP, comparing the advantages of ERP when adopted as a pay-per-use model versus a traditional on-premise solution. After the literature review several findings are presented on cloud ERP, illustrating the adoption factors and benefits for small, medium, and large organizations. Finally, the paper concludes with recommendations for organizations to ensure the security of sensitive corporate information when adopting cloud-based ERP, as well as the conclusion.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Method\">Method<\/span><\/h2>\n<p>The research approach was based on an exploratory search to review the existing literature on SaaS cloud-based ERPs and their benefits. Additionally, several papers were studied to identify issues on data security, particularly confidentiality and integrity problems that organizations should be aware of before adopting cloud-based ERP solutions. 
More than 50 articles from 2008 to 2015 were drawn from several A and A* journals<sup id=\"rdp-ebb-cite_ref-COREJP_9-0\" class=\"reference\"><a href=\"#cite_note-COREJP-9\" rel=\"external_link\">[9]<\/a><\/sup> such as <i>Journal of Information Systems<\/i>, <i>MIS Quarterly<\/i>, <i>Journal of Innovation<\/i>, <i>Management and Technology<\/i>, <i>Journal of Systems and Information Technology<\/i>, <i>International Journal of Computer Applications<\/i>, and <i>Journal of Network and Computer Applications<\/i>, among others. Searches were made using well-known academic databases and search engines for the computer science and information systems fields: IEEE Xplore, Emerald, ACM Digital Library, Gartner Core Research, ScienceDirect, and Google Scholar. The search terms included \u201ccloud ERP,\u201d \u201chybrid ERP,\u201d \u201cimplementation of ERP,\u201d \u201cSaaS ERP,\u201d \u201ccloud computing,\u201d and \u201cdata security issues.\u201d\n<\/p><p>After reviewing all the articles and papers, key insights and findings were gathered and classified according to the size of organizations. Based on the findings, several recommendations and possible solutions are outlined in this paper.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Literature_review\">Literature review<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Cloud_ERP\">Cloud ERP<\/span><\/h3>\n<p>The success of cloud computing, combined with the growing pressure on organizations to respond to unique customer needs in today\u2019s increasingly competitive business environments, has given rise to a new subscription-based delivery model for ERP, also referred to as cloud-based ERP or SaaS ERP. This new model of ERP systems functions in the same way as a traditional on-premise ERP solution. 
The main difference is that the infrastructure (the software, as well as the hardware and network connection) adopts a pay-per-use model; in other words, ERP is delivered as a service.<sup id=\"rdp-ebb-cite_ref-JohanssonCloud14_7-1\" class=\"reference\"><a href=\"#cite_note-JohanssonCloud14-7\" rel=\"external_link\">[7]<\/a><\/sup> The ERP in a SaaS model is accessed over the internet, while the application and data are controlled by the cloud service provider and offered as a \u201cready-to-use\u201d product to the end client for a monthly subscription fee.<sup id=\"rdp-ebb-cite_ref-JohanssonExploring13_10-0\" class=\"reference\"><a href=\"#cite_note-JohanssonExploring13-10\" rel=\"external_link\">[10]<\/a><\/sup>\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Traditional_ERP_vs_cloud_ERP\">Traditional ERP vs cloud ERP<\/span><\/h3>\n<p>A cloud-based ERP system uses the advantages of cloud computing to offer a new and more flexible approach to hosting and using ERP systems. A widespread shift from traditional ERP system architecture towards cloud-based SaaS ERP systems is ongoing.<sup id=\"rdp-ebb-cite_ref-LenartERP11_8-1\" class=\"reference\"><a href=\"#cite_note-LenartERP11-8\" rel=\"external_link\">[8]<\/a><\/sup> The advantages of cloud computing include ease of use and accessibility, virtualized resources, scalability, affordability, and availability, guaranteed through service level agreements (SLAs).<sup id=\"rdp-ebb-cite_ref-VaqueroABreak09_11-0\" class=\"reference\"><a href=\"#cite_note-VaqueroABreak09-11\" rel=\"external_link\">[11]<\/a><\/sup> Cloud computing, and in particular SaaS technology, enables ERP systems to turn some of their typical weaknesses, namely inflexibility, lack of scalability, and consumption of massive local resources (hardware, manpower, and financial expenditures), into advantages. 
Although significant concerns remain in the form of limited functionality, the potential loss of internal control, performance reliability, and security, cloud-based models continue to gain traction.<sup id=\"rdp-ebb-cite_ref-UtzigERP13_12-0\" class=\"reference\"><a href=\"#cite_note-UtzigERP13-12\" rel=\"external_link\">[12]<\/a><\/sup>\n<\/p><p>Figure 1 clearly shows the differences in operating costs, solution complexity, and implementation time of a traditional on-premise ERP system in comparison to cloud-based ERP systems.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig1_Saa_JofInfoSysEngMan2017_2-4.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"c0b44ed8e60a0649bbed4efe05f69df8\"><img alt=\"Fig1 Saa JofInfoSysEngMan2017 2-4.png\" src=\"https:\/\/www.limswiki.org\/images\/f\/f6\/Fig1_Saa_JofInfoSysEngMan2017_2-4.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 1.<\/b> ERP systems deployment models<sup id=\"rdp-ebb-cite_ref-UtzigERP13_12-1\" class=\"reference\"><a href=\"#cite_note-UtzigERP13-12\" rel=\"external_link\">[12]<\/a><\/sup><\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>In comparison to traditional ERPs, the advantages of cloud-based ERPs include<sup id=\"rdp-ebb-cite_ref-JohanssonExploring13_10-1\" class=\"reference\"><a href=\"#cite_note-JohanssonExploring13-10\" rel=\"external_link\">[10]<\/a><\/sup>:\n<\/p>\n<ul><li> enabling smaller clients who are not able to set up a complete, complex ERP system on-premise to use ERP;<\/li>\n<li> saving infrastructure expenditures (no large up-front capital investment necessary), as well as software, maintenance, and updating costs<sup 
id=\"rdp-ebb-cite_ref-ElragalInHouse12_13-0\" class=\"reference\"><a href=\"#cite_note-ElragalInHouse12-13\" rel=\"external_link\">[13]<\/a><\/sup>;<\/li>\n<li> reducing the staff needed for support and maintenance;<\/li>\n<li> enabling faster implementation of a cloud-based ERP, with less effort needed due to its agile design<sup id=\"rdp-ebb-cite_ref-ElragalInHouse12_13-1\" class=\"reference\"><a href=\"#cite_note-ElragalInHouse12-13\" rel=\"external_link\">[13]<\/a><\/sup>;<\/li>\n<li> offering better scalability (hardware\/performance\/user accounts can be increased quickly when needed and reduced just as easily when resources are no longer needed); and<\/li>\n<li> enabling mobility (the server in the cloud is always accessible, wherever the employee works). <\/li><\/ul>\n<p>Possible disadvantages include:\n<\/p>\n<ul><li> storage of organizational data in the cloud rather than on-premise;<\/li>\n<li> possible integrity and security issues due to loss of control over data storage and systems; and<\/li>\n<li> dependency on the cloud provider.<\/li><\/ul>\n<h3><span class=\"mw-headline\" id=\"Data_security_issues_in_cloud_ERP\">Data security issues in cloud ERP<\/span><\/h3>\n<p>As discussed in the previous sections, there is a clear tendency to move enterprise services and systems to the cloud. However, it is important for organizations that want to implement or use an ERP in the cloud (SaaS, PaaS, or IaaS) to address the possible issues and risks of migration. Some of the main drawbacks in any cloud-based ERP are related to data security, performance, and availability. Dillon <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-DillonCloud10_14-0\" class=\"reference\"><a href=\"#cite_note-DillonCloud10-14\" rel=\"external_link\">[14]<\/a><\/sup> have categorized security of data as the primary concern for organizations. 
Accordingly, this paper is focused on data security issues for cloud (SaaS) ERP.\n<\/p><p>Bishop<sup id=\"rdp-ebb-cite_ref-BishopIntro05_15-0\" class=\"reference\"><a href=\"#cite_note-BishopIntro05-15\" rel=\"external_link\">[15]<\/a><\/sup> states that computer security relies on the confidentiality, integrity, and availability of data. In that context, cloud computing and ERP systems directly influence the required level of security. For example, as mentioned in the previous sections, ERP systems manage organizational data for essential business operations. Therefore, it is crucial for organizations to ensure data confidentiality and integrity in a cloud environment.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Confidentiality\">Confidentiality<\/span><\/h3>\n<p>Weng and Hung<sup id=\"rdp-ebb-cite_ref-WengCompet14_16-0\" class=\"reference\"><a href=\"#cite_note-WengCompet14-16\" rel=\"external_link\">[16]<\/a><\/sup> explain that when organizations adopt cloud-based ERP systems, they should be prepared to mitigate the risks around cloud technologies and prevent unauthorized usage of data. In addition, Johansson <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-JohanssonCloud15_17-0\" class=\"reference\"><a href=\"#cite_note-JohanssonCloud15-17\" rel=\"external_link\">[17]<\/a><\/sup> found that organizations might feel insecure storing their data with external providers without having direct control over the data. Another problem that might affect the confidentiality of data is the lack of control over the cloud provider\u2019s staff, who could access and retrieve data for dishonest or even criminal activities. For instance, Hashizume <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-HashizumeAnAnal13_18-0\" class=\"reference\"><a href=\"#cite_note-HashizumeAnAnal13-18\" rel=\"external_link\">[18]<\/a><\/sup> argue that providers might not perform detailed background checks on staff who have unlimited access to the cloud data. 
Consequently, the key challenges to adopting cloud-based ERP are as follows.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Uncertainty_around_data_storage_arrangements\">Uncertainty around data storage arrangements<\/span><\/h4>\n<p>With the SaaS model, the client does not have any control over the IT infrastructure.<sup id=\"rdp-ebb-cite_ref-KumarMigration12_19-0\" class=\"reference\"><a href=\"#cite_note-KumarMigration12-19\" rel=\"external_link\">[19]<\/a><\/sup> Moreover, Puthal <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-PuthalCloud15_20-0\" class=\"reference\"><a href=\"#cite_note-PuthalCloud15-20\" rel=\"external_link\">[20]<\/a><\/sup> mention that the same provider often hosts data from several clients in the same data center. This type of hosting increases the risk of data leakage or corporate espionage. On the contrary, with on-premise ERP systems, organizations have absolute control over their data and infrastructure. Consequently, the way in which providers ensure the security and confidentiality of the client\u2019s data is one of the key challenges in the implementation of cloud-based ERP. 
Furthermore, in cases where the provider also offers public access to specific cloud services, the security challenges are even higher.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Lack_of_control_over_the_security_protocols_and_standards\">Lack of control over the security protocols and standards<\/span><\/h4>\n<p>Even though the number of reported security incidents from the industry regarding cloud-based ERPs is still small, its rapid adoption increasingly raises security concerns for organizations, much more than traditional on-premise ERPs did.<sup id=\"rdp-ebb-cite_ref-CastellinaSaaS11_21-0\" class=\"reference\"><a href=\"#cite_note-CastellinaSaaS11-21\" rel=\"external_link\">[21]<\/a><\/sup> Furthermore, the clients do not have full control or monitoring capabilities about who accesses their data from the provider side.<sup id=\"rdp-ebb-cite_ref-HashizumeAnAnal13_18-1\" class=\"reference\"><a href=\"#cite_note-HashizumeAnAnal13-18\" rel=\"external_link\">[18]<\/a><\/sup> The same applies to the protocols and standards used by providers to hire personnel and to implement or monitor their security infrastructure. Consequently, as these factors are dependent on the provider itself, a high level of uncertainty must be considered when implementing ERP on the cloud.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Integrity\">Integrity<\/span><\/h3>\n<p>The second main concern of securing enterprise data in the cloud is the need to ensure uniformity of the stored data. As mentioned by Puthal <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-PuthalCloud15_20-1\" class=\"reference\"><a href=\"#cite_note-PuthalCloud15-20\" rel=\"external_link\">[20]<\/a><\/sup>, the integrity of data can easily be lost or affected because of cloud providers\u2019 errors and failures. The same authors also argue that the traditional enterprise methods to validate the correctness of data are outside the enterprises\u2019 control; they are the responsibility of the cloud provider. 
As a consequence, a common method used to ensure data integrity in cloud environments is public auditing. This method uses a third-party verifier that provides expert integrity checking services.<sup id=\"rdp-ebb-cite_ref-PuthalCloud15_20-2\" class=\"reference\"><a href=\"#cite_note-PuthalCloud15-20\" rel=\"external_link\">[20]<\/a><\/sup> Even though the method we mention is commonly used by cloud providers, it raises additional issues like the risk of sensitive information leakage from organizations using cloud providers. From a similar perspective, Akande <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-AkandeManag13_22-0\" class=\"reference\"><a href=\"#cite_note-AkandeManag13-22\" rel=\"external_link\">[22]<\/a><\/sup> claim that the methods of authentication and the levels of authorization to manipulate data are crucial concerns for the overall data integrity. \n<\/p><p>The process of selecting and adopting a cloud provider should also take into consideration the following challenges.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Relationship_of_trust_between_the_cloud_provider_and_client\">Relationship of trust between the cloud provider and client<\/span><\/h4>\n<p>Assuring the integrity of data is mainly the responsibility of the provider. Therefore, clients must trust the providers to comply with the agreed-on security measures and protocols to achieve integrity of data. 
As mentioned by several authors<sup id=\"rdp-ebb-cite_ref-PengCloud15_23-0\" class=\"reference\"><a href=\"#cite_note-PengCloud15-23\" rel=\"external_link\">[23]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-SubashiniASurv11_24-0\" class=\"reference\"><a href=\"#cite_note-SubashiniASurv11-24\" rel=\"external_link\">[24]<\/a><\/sup>, the relationship of trust is based not only on the provider\u2019s reputation but also on the specifications of the SLAs between provider and client.\n<\/p>\n<h4><span class=\"mw-headline\" id=\"Provider.E2.80.99s_transaction_management_standards\">Provider\u2019s transaction management standards<\/span><\/h4>\n<p>Subashini and Kavitha<sup id=\"rdp-ebb-cite_ref-SubashiniASurv11_24-1\" class=\"reference\"><a href=\"#cite_note-SubashiniASurv11-24\" rel=\"external_link\">[24]<\/a><\/sup> argue that in complex settings like cloud computing, it is very difficult to assure data integrity. They note that the HTTP transaction protocol does not provide guaranteed delivery of data. Additionally, the study shows that SaaS applications should be based on standardized application programming interfaces (APIs) as a technological basis for interorganizational systems communication. Standardized APIs ensure that only intended read and write access to data is allowed. 
However, this best practice to manage data integrity is often not considered by cloud service providers.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Summary_of_data_security_issues\">Summary of data security issues<\/span><\/h3>\n<p>Based on the literature review, Table 1 summarizes the major data security concerns that IT leaders should consider in order to move their ERP systems into the cloud.\n<\/p>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table class=\"wikitable\" border=\"1\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\" colspan=\"2\"><b>Table 1.<\/b> Data security issues\n<\/td><\/tr>\n<tr>\n<th style=\"background-color:#dddddd; padding-left:10px; padding-right:10px;\">Issue\n<\/th>\n<th style=\"background-color:#dddddd; padding-left:10px; padding-right:10px;\">Description\n<\/th><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Confidentiality\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<p>Lack of data control<br \/>\nLack of staff control from cloud provider<br \/>\nUncertainty on data storage arrangements<br \/>\nLack of control over security protocols and standards<br \/>\n<\/p>\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Integrity\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<p>Lack of uniformity on stored data<br \/>\nInformation leakage by third-parties over organizations using cloud providers<br \/>\nLack of trust between the cloud provider and client<br \/>\nBeware of provider\u2019s transaction management standards<br \/>\n<\/p>\n<\/td><\/tr>\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">Availability\n<\/td>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\">\n<p>Depends on cloud 
provider\n<\/p>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h2><span class=\"mw-headline\" id=\"Findings\">Findings<\/span><\/h2>\n<p>Cloud technologies provide a disruptive alternative to traditional on-premise ERP solutions and are offering innovative ways to generate business value and maintain competitive advantage.<sup id=\"rdp-ebb-cite_ref-WengCompet14_16-1\" class=\"reference\"><a href=\"#cite_note-WengCompet14-16\" rel=\"external_link\">[16]<\/a><\/sup> In addition to the myriad benefits that cloud-based ERP offers \u2014 such as flexibility, scalability, ease of implementation, and cost savings<sup id=\"rdp-ebb-cite_ref-UtzigERP13_12-2\" class=\"reference\"><a href=\"#cite_note-UtzigERP13-12\" rel=\"external_link\">[12]<\/a><\/sup> \u2014 one of the biggest impediments to adopt cloud-based ERP is the risk around data security, namely integrity and confidentiality of the organization's data. In a recent survey conducted by the IDC group, of the 1,100 organizations surveyed on the top inhibitors for cloud-based ERP solutions, 50% of the organizations responded saying security and confidentiality of the data is their primary concern when thinking about moving their enterprise systems to the cloud.<sup id=\"rdp-ebb-cite_ref-FauscetteERP13_25-0\" class=\"reference\"><a href=\"#cite_note-FauscetteERP13-25\" rel=\"external_link\">[25]<\/a><\/sup> See Figure 2 for more.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig2_Saa_JofInfoSysEngMan2017_2-4.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"1b025de50cb7a464820981240b3a7654\"><img alt=\"Fig2 Saa JofInfoSysEngMan2017 2-4.png\" src=\"https:\/\/www.limswiki.org\/images\/a\/a3\/Fig2_Saa_JofInfoSysEngMan2017_2-4.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td 
style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 2.<\/b> Top inhibitors for cloud ERP<sup id=\"rdp-ebb-cite_ref-FauscetteERP13_25-1\" class=\"reference\"><a href=\"#cite_note-FauscetteERP13-25\" rel=\"external_link\">[25]<\/a><\/sup><\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>SaaS is gaining popularity and is changing the way organizations deploy and use ERP systems. However, the concerns around data integrity and confidentiality need to be addressed before organizations can successfully implement SaaS-based ERP solutions. Additionally, the existing literature shows that adoption rates for cloud-based ERP are highly dependent on the industry type and functions.<sup id=\"rdp-ebb-cite_ref-ClarkBeyond14_26-0\" class=\"reference\"><a href=\"#cite_note-ClarkBeyond14-26\" rel=\"external_link\">[26]<\/a><\/sup> Given the important role that ERP systems play in the functioning of an organization, having to move mission-critical applications to a third-party cloud vendor and deal with the associated security issues could negatively impact the adoption rates of SaaS-based ERP.<sup id=\"rdp-ebb-cite_ref-JohanssonExploring13_10-2\" class=\"reference\"><a href=\"#cite_note-JohanssonExploring13-10\" rel=\"external_link\">[10]<\/a><\/sup>\n<\/p><p>It can be gathered from the literature that, due to the low capital expenditure and accelerated time to market, small to medium enterprises (SMEs) benefit from cloud-based ERPs more easily, since many of the issues and challenges revolve primarily around data security, confidentiality, and concerns about relocating mission-critical applications to the cloud, which are often not primary concerns for SMEs.<sup id=\"rdp-ebb-cite_ref-JohanssonCloud14_7-2\" class=\"reference\"><a href=\"#cite_note-JohanssonCloud14-7\" rel=\"external_link\">[7]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-WaligumImpact08_27-0\" class=\"reference\"><a href=\"#cite_note-WaligumImpact08-27\" 
rel=\"external_link\">[27]<\/a><\/sup> The risks associated with storing an organization\u2019s sensitive data on the cloud, and the associated data confidentiality and integrity issues, are less of an inhibitor for SMEs when adopting cloud-based ERP, as they do not possess the financial resources to build and implement an on-premise ERP solution in the first place.<sup id=\"rdp-ebb-cite_ref-JohanssonCloud14_7-3\" class=\"reference\"><a href=\"#cite_note-JohanssonCloud14-7\" rel=\"external_link\">[7]<\/a><\/sup> SMEs also believe that, given their own lack of IT expertise, the security measures that the cloud-based ERP vendors provide are more sophisticated than those that they could implement on-premise. In the long run, the operational expenditure of a cloud-based ERP solution is far lower for SMEs, enabling them to reduce their overall IT expenditure while gaining access to state-of-the-art IT infrastructure and expertise through a pay-per-use model.<sup id=\"rdp-ebb-cite_ref-JohanssonCloud14_7-4\" class=\"reference\"><a href=\"#cite_note-JohanssonCloud14-7\" rel=\"external_link\">[7]<\/a><\/sup> A SaaS ERP solution also gives SMEs the opportunity to channel their resources effectively into the important aspects of their business, enabling them to maintain their competitive advantage.<sup id=\"rdp-ebb-cite_ref-JohanssonCloud14_7-5\" class=\"reference\"><a href=\"#cite_note-JohanssonCloud14-7\" rel=\"external_link\">[7]<\/a><\/sup>\n<\/p><p>On the other hand, cloud-based ERP implementations raise many security concerns for larger organizations, as they feel insecure about storing their confidential and sensitive information on the cloud, particularly since they have to hand over control to the provider to process the information. 
Larger organizations are heavily concerned about the probability and impact of a potential security breach that could, for example, damage their reputation, result in financial losses, and in some cases even represent industrial espionage.<sup id=\"rdp-ebb-cite_ref-JohanssonCloud14_7-6\" class=\"reference\"><a href=\"#cite_note-JohanssonCloud14-7\" rel=\"external_link\">[7]<\/a><\/sup> As a result of these concerns, larger organizations are not motivated to move their mission-critical applications to the cloud, and since they normally have highly skilled internal IT teams, they prefer to implement on-premise ERP systems with high security standards. Another factor that influences larger organizations to continue with their on-premise ERP solutions is the subscription model associated with SaaS-based solutions. Due to the large user base and the number of ERP modules of these organizations, in the long run the subscription fees for cloud-based ERPs are higher than the cost of implementing and maintaining an on-premise solution.<sup id=\"rdp-ebb-cite_ref-JohanssonCloud14_7-7\" class=\"reference\"><a href=\"#cite_note-JohanssonCloud14-7\" rel=\"external_link\">[7]<\/a><\/sup> Thus, although Utzig <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-UtzigERP13_12-3\" class=\"reference\"><a href=\"#cite_note-UtzigERP13-12\" rel=\"external_link\">[12]<\/a><\/sup> state that \u201cthe total cost of ownership for a cloud-based solution can be 50% to 60% less than for traditional solutions over a 10-year period,\u201d this cannot be assumed to hold for large organizations. In other words, for large organizations, moving on-premise ERP systems to the cloud is not necessarily associated with cost savings. 
A previous study from Utzig <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-UtzigERP13_12-4\" class=\"reference\"><a href=\"#cite_note-UtzigERP13-12\" rel=\"external_link\">[12]<\/a><\/sup>, represented in Figure 3, demonstrates the cost comparison between on-premise and cloud-based solutions.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig3_Saa_JofInfoSysEngMan2017_2-4.png\" class=\"image wiki-link\" target=\"_blank\" data-key=\"e3fd908cac10a4c098236d7fd400e93e\"><img alt=\"Fig3 Saa JofInfoSysEngMan2017 2-4.png\" src=\"https:\/\/www.limswiki.org\/images\/f\/f8\/Fig3_Saa_JofInfoSysEngMan2017_2-4.png\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 3.<\/b> Cost comparison of on-premise and cloud-based solutions<sup id=\"rdp-ebb-cite_ref-UtzigERP13_12-5\" class=\"reference\"><a href=\"#cite_note-UtzigERP13-12\" rel=\"external_link\">[12]<\/a><\/sup><\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h2><span class=\"mw-headline\" id=\"Recommendations_and_possible_solutions\">Recommendations and possible solutions<\/span><\/h2>\n<p>Given the existing concerns about data security in cloud-based ERPs, organizations should take proactive measures to ensure that sufficient data security policies and procedures are in place and negotiated with the cloud vendor in order to secure the confidentiality and integrity of sensitive corporate data.<sup id=\"rdp-ebb-cite_ref-ClarkBeyond14_26-1\" class=\"reference\"><a href=\"#cite_note-ClarkBeyond14-26\" rel=\"external_link\">[26]<\/a><\/sup> Following are some recommendations that organizations \u2014 specifically large enterprises \u2014 should follow before moving their ERP applications to the cloud<sup 
id=\"rdp-ebb-cite_ref-WengCompet14_16-2\" class=\"reference\"><a href=\"#cite_note-WengCompet14-16\" rel=\"external_link\">[16]<\/a><\/sup>:\n<\/p>\n<ul><li> Organizations should negotiate stringent policies and SLAs with cloud vendors to ensure protection of sensitive information stored in the cloud. The policies should clearly define how each type of information is classified. <\/li>\n<li> The internal IT teams and security experts of the organizations should always be involved when evaluating cloud vendors and their security standards. <\/li>\n<li> Organizations should always perform an extensive analysis and implement control mechanisms before sharing confidential and sensitive information with cloud vendors. <\/li>\n<li> Organizations should evaluate which applications are critical to maintaining their competitive advantage and, on that basis, define strict policies for the information and applications that could be moved to the cloud. <\/li>\n<li> Cloud vendors should be transparent about their network security infrastructure and should provide this information to the client. <\/li>\n<li> Organizations should run training programs and awareness campaigns to educate their employees about the data security risks of cloud-based ERPs and the actions needed to mitigate those risks, so that sensitive corporate information is not compromised.<sup id=\"rdp-ebb-cite_ref-ClarkBeyond14_26-2\" class=\"reference\"><a href=\"#cite_note-ClarkBeyond14-26\" rel=\"external_link\">[26]<\/a><\/sup><\/li><\/ul>\n<p>In addition to the above recommendations, organizations should also ensure that a comprehensive security strategy is defined before migrating their enterprise applications to the cloud. 
Specific security standards need to be enforced at all levels by incorporating a framework that addresses security at the physical, network, data, and application level.<sup id=\"rdp-ebb-cite_ref-BinuASecurity12_28-0\" class=\"reference\"><a href=\"#cite_note-BinuASecurity12-28\" rel=\"external_link\">[28]<\/a><\/sup> A security framework should include components relating to the physical security, data storage security, access security, application security, and transmission security. Physical security policies should include rules of conduct for employees and mechanisms to ensure those rules are being followed.\n<\/p><p>Strict access security policies to prevent unauthorized access from internal and external sources should be enforced as well. Application security should include authentication mechanisms to verify the identity of the end users. Data security should always include strong encryption techniques to prevent any possible data leakage.<sup id=\"rdp-ebb-cite_ref-KumbharTheComp12_29-0\" class=\"reference\"><a href=\"#cite_note-KumbharTheComp12-29\" rel=\"external_link\">[29]<\/a><\/sup> Furthermore, the authentication module should exactly define what level of access each user has.\n<\/p><p>Additionally, mechanisms to ensure integrity of data and to safeguard its uniformity across multiple locations should be put in place. In order to assure confidentiality and integrity of data, its transmission to the provider should be secured by the application of encryption mechanisms. 
The recommended measures should be applied on both the provider and the client sides.<sup id=\"rdp-ebb-cite_ref-BinuASecurity12_28-1\" class=\"reference\"><a href=\"#cite_note-BinuASecurity12-28\" rel=\"external_link\">[28]<\/a><\/sup> The security strategy should also include a contingency plan that gives the organization the capability and resources to move to a new cloud provider in case of an emergency in the shortest possible time and with the least amount of impact.\n<\/p><p>SMEs are more open to moving the entirety of their applications to the cloud, whereas larger organizations are still more conservative in their approach due to the risks associated with potential security breaches and their ability to implement high security standards for their on-premise solutions themselves.<sup id=\"rdp-ebb-cite_ref-JohanssonCloud14_7-8\" class=\"reference\"><a href=\"#cite_note-JohanssonCloud14-7\" rel=\"external_link\">[7]<\/a><\/sup> Thus, SMEs adopt cloud-based ERP solutions at a faster rate than larger organizations. However, a recent development that is gaining popularity and momentum among larger organizations is that of a two-tier ERP strategy, also known as hybrid cloud-based ERP. Accordingly, Ruivo <i>et al.<\/i><sup id=\"rdp-ebb-cite_ref-RuivoTheERP15_30-0\" class=\"reference\"><a href=\"#cite_note-RuivoTheERP15-30\" rel=\"external_link\">[30]<\/a><\/sup> argue that more than 77% of IT firms will implement hybrid ERP solutions; however, only slightly more than 20% currently have structured plans to implement this technology. In addition, Peng and Gala<sup id=\"rdp-ebb-cite_ref-PengCloud15_23-1\" class=\"reference\"><a href=\"#cite_note-PengCloud15-23\" rel=\"external_link\">[23]<\/a><\/sup> also consider a hybrid ERP an effective solution for organizations to keep on-premise ERP core functions combined with business cloud services before moving to a full cloud-based ERP solution.\n<\/p><p>Hybrid cloud-based ERP provides organizations with the best of both worlds. 
Organizations can choose to keep their mission-critical applications on-premise while migrating the other modules of the ERP into the cloud. A report from PwC<sup id=\"rdp-ebb-cite_ref-ClarkBeyond14_26-3\" class=\"reference\"><a href=\"#cite_note-ClarkBeyond14-26\" rel=\"external_link\">[26]<\/a><\/sup> suggests that one of the key aspects of hybrid ERP is that it allows organizations to take functions out of the on-premise ERP and move them to the cloud, thereby providing organizations with a higher degree of flexibility to support business operations with the use of cloud technology. For instance, the same report shows that the core operations related to inventory, financials, or employee master management could remain as part of the on-premise ERP. This agile and highly flexible approach allows them to implement more sophisticated, customer-driven business models.<sup id=\"rdp-ebb-cite_ref-ColumbusFive15_31-0\" class=\"reference\"><a href=\"#cite_note-ColumbusFive15-31\" rel=\"external_link\">[31]<\/a><\/sup> It enables organizations to take advantage of the cloud-based ERP benefits while minimizing the risks of storing sensitive corporate data in the cloud.<sup id=\"rdp-ebb-cite_ref-PengCloud15_23-2\" class=\"reference\"><a href=\"#cite_note-PengCloud15-23\" rel=\"external_link\">[23]<\/a><\/sup>\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Conclusion\">Conclusion<\/span><\/h2>\n<p>Several of cloud computing's benefits encourage organizations to evaluate and implement an ERP system in the cloud, based on the distribution model SaaS. This new approach to ERPs turns some of the weaknesses of traditional ERPs into benefits. The main benefits of cloud-based ERPs are their scalability and lower investment costs, creating opportunities for SMEs.\n<\/p><p>However, the main weaknesses and threats to this new approach are the security and integrity risks to the data stored in the system, which have been discussed in this paper. 
Large organizations in particular are slow to adopt cloud-based ERP systems due to concerns about securely storing sensitive information on third-party servers. The risk of breaches in security and integrity as well as possible misuse of confidential information by the service providers are further drawbacks.\n<\/p><p>Nevertheless, a new type of solution has begun to take hold in large organizations, one that combines the best of both worlds (cloud and traditional ERPs): hybrid cloud-based ERPs or two-tiered ERPs. Hybrid cloud-based ERPs allow organizations to store their most sensitive data in on-premise solutions while migrating the other modules into a cloud solution. This enables them to benefit from the agility and scalability of cloud-based ERP solutions while still keeping the security advantages of on-premise solutions for their mission-critical data. Another benefit inherited from cloud-based solutions is the ability to deploy services on demand, reducing the risk associated with the implementation of an entire module for a core on-premise ERP. 
Moreover, the ability to enhance mobility, system performance, and customization is driving organizations to move to hybrid ERP solutions.<sup id=\"rdp-ebb-cite_ref-PengCloud15_23-3\" class=\"reference\"><a href=\"#cite_note-PengCloud15-23\" rel=\"external_link\">[23]<\/a><\/sup> Therefore, hybrid cloud-based ERPs are especially suitable for larger organizations, which have so far hesitated to move their ERPs into the cloud.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Acknowledgements\">Acknowledgements<\/span><\/h2>\n<p>An initial version of this paper was published as \u201c<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/ieeexplore.ieee.org\/document\/7975779\/\" target=\"_blank\">Data Security Issues in Cloud-Based Software-as-a-Service ERP<\/a>,\u201d in the 12th Iberian Conference on Information Systems and Technologies (CISTI 2017).\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-IBMWhat-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-IBMWhat_1-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.ibm.com\/cloud\/learn\/what-is-cloud-computing\" target=\"_blank\">\"What is cloud computing?\"<\/a>. IBM<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.ibm.com\/cloud\/learn\/what-is-cloud-computing\" target=\"_blank\">https:\/\/www.ibm.com\/cloud\/learn\/what-is-cloud-computing<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 01 February 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=What+is+cloud+computing%3F&rft.atitle=&rft.pub=IBM&rft_id=https%3A%2F%2Fwww.ibm.com%2Fcloud%2Flearn%2Fwhat-is-cloud-computing&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LinCloud12-2\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-LinCloud12_2-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Lin, A.; Chen, N.-C. (2012). \"Cloud computing as an innovation: Perception, attitude, and adoption\". <i>International Journal of Information Management<\/i> <b>32<\/b> (6): 533\u2013540. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.ijinfomgt.2012.04.001\" target=\"_blank\">10.1016\/j.ijinfomgt.2012.04.001<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Cloud+computing+as+an+innovation%3A+Perception%2C+attitude%2C+and+adoption&rft.jtitle=International+Journal+of+Information+Management&rft.aulast=Lin%2C+A.%3B+Chen%2C+N.-C.&rft.au=Lin%2C+A.%3B+Chen%2C+N.-C.&rft.date=2012&rft.volume=32&rft.issue=6&rft.pages=533%E2%80%93540&rft_id=info:doi\/10.1016%2Fj.ijinfomgt.2012.04.001&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GartnerCloud-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GartnerCloud_3-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span 
class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.gartner.com\/it-glossary\/cloud-computing\" target=\"_blank\">\"Cloud Computing\"<\/a>. <i>Gartner IT Glossary<\/i>. Gartner, Inc<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.gartner.com\/it-glossary\/cloud-computing\" target=\"_blank\">https:\/\/www.gartner.com\/it-glossary\/cloud-computing<\/a><\/span><span class=\"reference-accessdate\">. Retrieved 27 September 2015<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Cloud+Computing&rft.atitle=Gartner+IT+Glossary&rft.pub=Gartner%2C+Inc&rft_id=https%3A%2F%2Fwww.gartner.com%2Fit-glossary%2Fcloud-computing&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GorelikCloud13-4\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GorelikCloud13_4-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Gorelik, E. (January 2013). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/web.mit.edu\/smadnick\/www\/wp\/2013-01.pdf\" target=\"_blank\">\"Cloud Computing Models\"<\/a> (PDF). Massachusetts Institute of Technology<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/web.mit.edu\/smadnick\/www\/wp\/2013-01.pdf\" target=\"_blank\">http:\/\/web.mit.edu\/smadnick\/www\/wp\/2013-01.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Cloud+Computing+Models&rft.atitle=&rft.aulast=Gorelik%2C+E.&rft.au=Gorelik%2C+E.&rft.date=January+2013&rft.pub=Massachusetts+Institute+of+Technology&rft_id=http%3A%2F%2Fweb.mit.edu%2Fsmadnick%2Fwww%2Fwp%2F2013-01.pdf&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-OLoughlinITServices14-5\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-OLoughlinITServices14_5-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">O'Loughlin, M. (September 2014). <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.axelos.com\/case-studies-and-white-papers\/it-service-management-and-cloud-computing\" target=\"_blank\">\"IT Service Management and Cloud Computing White Paper\"<\/a>. Axelos Limited<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.axelos.com\/case-studies-and-white-papers\/it-service-management-and-cloud-computing\" target=\"_blank\">https:\/\/www.axelos.com\/case-studies-and-white-papers\/it-service-management-and-cloud-computing<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 23 January 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=IT+Service+Management+and+Cloud+Computing+White+Paper&rft.atitle=&rft.aulast=O%27Loughlin%2C+M.&rft.au=O%27Loughlin%2C+M.&rft.date=September+2014&rft.pub=Axelos+Limited&rft_id=https%3A%2F%2Fwww.axelos.com%2Fcase-studies-and-white-papers%2Fit-service-management-and-cloud-computing&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ShehabEnter04-6\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ShehabEnter04_6-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Shehab, E.M.; Sharp, M.W.; Supramaniam, L.; Spedding, T.A. (2004). \"Enterprise resource planning: An integrative review\". <i>Business Process Management Journal<\/i> <b>10<\/b> (4): 359-386. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1108%2F14637150410548056\" target=\"_blank\">10.1108\/14637150410548056<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Enterprise+resource+planning%3A+An+integrative+review&rft.jtitle=Business+Process+Management+Journal&rft.aulast=Shehab%2C+E.M.%3B+Sharp%2C+M.W.%3B+Supramaniam%2C+L.%3B+Spedding%2C+T.A.&rft.au=Shehab%2C+E.M.%3B+Sharp%2C+M.W.%3B+Supramaniam%2C+L.%3B+Spedding%2C+T.A.&rft.date=2004&rft.volume=10&rft.issue=4&rft.pages=359-386&rft_id=info:doi\/10.1108%2F14637150410548056&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-JohanssonCloud14-7\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-JohanssonCloud14_7-0\" rel=\"external_link\">7.0<\/a><\/sup> <sup><a href=\"#cite_ref-JohanssonCloud14_7-1\" rel=\"external_link\">7.1<\/a><\/sup> <sup><a href=\"#cite_ref-JohanssonCloud14_7-2\" rel=\"external_link\">7.2<\/a><\/sup> <sup><a href=\"#cite_ref-JohanssonCloud14_7-3\" rel=\"external_link\">7.3<\/a><\/sup> <sup><a href=\"#cite_ref-JohanssonCloud14_7-4\" rel=\"external_link\">7.4<\/a><\/sup> <sup><a href=\"#cite_ref-JohanssonCloud14_7-5\" rel=\"external_link\">7.5<\/a><\/sup> <sup><a href=\"#cite_ref-JohanssonCloud14_7-6\" rel=\"external_link\">7.6<\/a><\/sup> <sup><a href=\"#cite_ref-JohanssonCloud14_7-7\" rel=\"external_link\">7.7<\/a><\/sup> <sup><a href=\"#cite_ref-JohanssonCloud14_7-8\" rel=\"external_link\">7.8<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Johansson, B.; Alajbegovic, A.; Alexopoulos, V.; Desalermos, A. (2014). 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/lup.lub.lu.se\/record\/4770066\" target=\"_blank\">\"Cloud ERP Adoption Opportunities and Concerns: A Comparison between SMES and Large Companies\"<\/a>. <i>Pre-ECIS 2014 Workshop \"IT Operations Management\"<\/i><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/lup.lub.lu.se\/record\/4770066\" target=\"_blank\">http:\/\/lup.lub.lu.se\/record\/4770066<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Cloud+ERP+Adoption+Opportunities+and+Concerns%3A+A+Comparison+between+SMES+and+Large+Companies&rft.jtitle=Pre-ECIS+2014+Workshop+%22IT+Operations+Management%22&rft.aulast=Johansson%2C+B.%3B+Alajbegovic%2C+A.%3B+Alexopoulos%2C+V.%3B+Desalermos%2C+A.&rft.au=Johansson%2C+B.%3B+Alajbegovic%2C+A.%3B+Alexopoulos%2C+V.%3B+Desalermos%2C+A.&rft.date=2014&rft_id=http%3A%2F%2Flup.lub.lu.se%2Frecord%2F4770066&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-LenartERP11-8\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-LenartERP11_8-0\" rel=\"external_link\">8.0<\/a><\/sup> <sup><a href=\"#cite_ref-LenartERP11_8-1\" rel=\"external_link\">8.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation book\">Lenart, A. (2011). \"ERP in the Cloud - Benefits and Challenges\". In Wrycza, S.. <i>Research in Systems Analysis and Design: Models and Methods<\/i>. Lecture Notes in Business Information Processing. <b>93<\/b>. Springer. pp. 39\u201350. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9783642256769.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=ERP+in+the+Cloud+-+Benefits+and+Challenges&rft.atitle=Research+in+Systems+Analysis+and+Design%3A+Models+and+Methods&rft.aulast=Lenart%2C+A.&rft.au=Lenart%2C+A.&rft.date=2011&rft.series=Lecture+Notes+in+Business+Information+Processing&rft.volume=93&rft.pages=pp.%26nbsp%3B39%E2%80%9350&rft.pub=Springer&rft.isbn=9783642256769&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-COREJP-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-COREJP_9-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/portal.core.edu.au\/jnl-ranks\/\" target=\"_blank\">\"CORE Journal Portal\"<\/a>. Computing Research & Education<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/portal.core.edu.au\/jnl-ranks\/\" target=\"_blank\">http:\/\/portal.core.edu.au\/jnl-ranks\/<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 23 January 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=CORE+Journal+Portal&rft.atitle=&rft.pub=Computing+Research+%26+Education&rft_id=http%3A%2F%2Fportal.core.edu.au%2Fjnl-ranks%2F&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-JohanssonExploring13-10\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-JohanssonExploring13_10-0\" rel=\"external_link\">10.0<\/a><\/sup> <sup><a href=\"#cite_ref-JohanssonExploring13_10-1\" rel=\"external_link\">10.1<\/a><\/sup> <sup><a href=\"#cite_ref-JohanssonExploring13_10-2\" rel=\"external_link\">10.2<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Johansson, B.; Ruivo, P. (2013). \"Exploring Factors for Adopting ERP as SaaS\". <i>Procedia Technology<\/i> <b>9<\/b>: 94\u201399. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.protcy.2013.12.010\" target=\"_blank\">10.1016\/j.protcy.2013.12.010<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Exploring+Factors+for+Adopting+ERP+as+SaaS&rft.jtitle=Procedia+Technology&rft.aulast=Johansson%2C+B.%3B+Ruivo%2C+P.&rft.au=Johansson%2C+B.%3B+Ruivo%2C+P.&rft.date=2013&rft.volume=9&rft.pages=94%E2%80%9399&rft_id=info:doi\/10.1016%2Fj.protcy.2013.12.010&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-VaqueroABreak09-11\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-VaqueroABreak09_11-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Vaquero, L.M.; Rodero-Merino, L.; Caceres, J.; Lindner, M. (2009). \"A break in the clouds: Towards a cloud definition\". <i>ACM SIGCOMM Computer Communication Review<\/i> <b>39<\/b> (1): 50\u201355. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1145%2F1496091.1496100\" target=\"_blank\">10.1145\/1496091.1496100<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+break+in+the+clouds%3A+Towards+a+cloud+definition&rft.jtitle=ACM+SIGCOMM+Computer+Communication+Review&rft.aulast=Vaquero%2C+L.M.%3B+Rodero-Merino%2C+L.%3B+Caceres%2C+J.%3B+Lindner%2C+M.&rft.au=Vaquero%2C+L.M.%3B+Rodero-Merino%2C+L.%3B+Caceres%2C+J.%3B+Lindner%2C+M.&rft.date=2009&rft.volume=39&rft.issue=1&rft.pages=50%E2%80%9355&rft_id=info:doi\/10.1145%2F1496091.1496100&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-UtzigERP13-12\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-UtzigERP13_12-0\" rel=\"external_link\">12.0<\/a><\/sup> <sup><a href=\"#cite_ref-UtzigERP13_12-1\" rel=\"external_link\">12.1<\/a><\/sup> <sup><a href=\"#cite_ref-UtzigERP13_12-2\" rel=\"external_link\">12.2<\/a><\/sup> <sup><a href=\"#cite_ref-UtzigERP13_12-3\" rel=\"external_link\">12.3<\/a><\/sup> <sup><a href=\"#cite_ref-UtzigERP13_12-4\" rel=\"external_link\">12.4<\/a><\/sup> <sup><a href=\"#cite_ref-UtzigERP13_12-5\" rel=\"external_link\">12.5<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation web\">Utzig, C.; Holland, D.; Horvath, M.; Manohar, M. (2013). <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.strategyand.pwc.com\/media\/file\/Strategyand_ERP-in-the-Cloud.pdf\" target=\"_blank\">\"ERP in the cloud: Is it ready? Are you?\"<\/a> (PDF). Booz & Company<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.strategyand.pwc.com\/media\/file\/Strategyand_ERP-in-the-Cloud.pdf\" target=\"_blank\">https:\/\/www.strategyand.pwc.com\/media\/file\/Strategyand_ERP-in-the-Cloud.pdf<\/a><\/span><span class=\"reference-accessdate\">. Retrieved 01 February 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=ERP+in+the+cloud%3A+Is+it+ready%3F+Are+you%3F&rft.atitle=&rft.aulast=Utzig%2C+C.%3B+Holland%2C+D.%3B+Horvath%2C+M.%3B+Manohar%2C+M.&rft.au=Utzig%2C+C.%3B+Holland%2C+D.%3B+Horvath%2C+M.%3B+Manohar%2C+M.&rft.date=2013&rft.pub=Booz+%26+Company&rft_id=https%3A%2F%2Fwww.strategyand.pwc.com%2Fmedia%2Ffile%2FStrategyand_ERP-in-the-Cloud.pdf&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ElragalInHouse12-13\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-ElragalInHouse12_13-0\" rel=\"external_link\">13.0<\/a><\/sup> <sup><a href=\"#cite_ref-ElragalInHouse12_13-1\" rel=\"external_link\">13.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Elragal, A.; El Kommos, M. (2012). \"In-House versus In-Cloud ERP Systems: A Comparative Study\". <i>Journal of Enterprise Resource Planning Studies<\/i> <b>2012<\/b> (2012): 659957. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.5171%2F2012.659957\" target=\"_blank\">10.5171\/2012.659957<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=In-House+versus+In-Cloud+ERP+Systems%3A+A+Comparative+Study&rft.jtitle=Journal+of+Enterprise+Resource+Planning+Studies&rft.aulast=Elragal%2C+A.%3B+El+Kommos%2C+M.&rft.au=Elragal%2C+A.%3B+El+Kommos%2C+M.&rft.date=2012&rft.volume=2012&rft.issue=2012&rft.pages=659957&rft_id=info:doi\/10.5171%2F2012.659957&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DillonCloud10-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-DillonCloud10_14-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Dillon, T.; Wu, C.; Chang, E. (2010). \"Cloud Computing: Issues and Challenges\". <i>Proceedings of 24th IEEE International Conference on Advanced Information Networking and Applications<\/i> <b>2010<\/b>: 27\u201333. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FAINA.2010.187\" target=\"_blank\">10.1109\/AINA.2010.187<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Cloud+Computing%3A+Issues+and+Challenges&rft.jtitle=Proceedings+of+24th+IEEE+International+Conference+on+Advanced+Information+Networking+and+Applications&rft.aulast=Dillon%2C+T.%3B+Wu%2C+C.%3B+Chang%2C+E.&rft.au=Dillon%2C+T.%3B+Wu%2C+C.%3B+Chang%2C+E.&rft.date=2010&rft.volume=2010&rft.pages=27%E2%80%9333&rft_id=info:doi\/10.1109%2FAINA.2010.187&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BishopIntro05-15\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BishopIntro05_15-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Bishop, M. (2005). <i>Introduction to Computer Security<\/i>. Addison-Wesley. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9780321247445.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=book&rft.btitle=Introduction+to+Computer+Security&rft.aulast=Bishop%2C+M.&rft.au=Bishop%2C+M.&rft.date=2005&rft.pub=Addison-Wesley&rft.isbn=9780321247445&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WengCompet14-16\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-WengCompet14_16-0\" rel=\"external_link\">16.0<\/a><\/sup> <sup><a href=\"#cite_ref-WengCompet14_16-1\" rel=\"external_link\">16.1<\/a><\/sup> <sup><a href=\"#cite_ref-WengCompet14_16-2\" rel=\"external_link\">16.2<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Weng, F.; Hung, M.-C. (2014). \"Competition and Challenge on Adopting Cloud ERP\". <i>International Journal of Innovation, Management and Technology<\/i> <b>5<\/b> (4): 309-313. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.7763%2FIJIMT.2014.V5.531\" target=\"_blank\">10.7763\/IJIMT.2014.V5.531<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Competition+and+Challenge+on+Adopting+Cloud+ERP&rft.jtitle=International+Journal+of+Innovation%2C+Management+and+Technology&rft.aulast=Weng%2C+F.%3B+Hung%2C+M.-C.&rft.au=Weng%2C+F.%3B+Hung%2C+M.-C.&rft.date=2014&rft.volume=5&rft.issue=4&rft.pages=309-313&rft_id=info:doi\/10.7763%2FIJIMT.2014.V5.531&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-JohanssonCloud15-17\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-JohanssonCloud15_17-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Johansson, B.; Alajbegovic, A.; Alexopoulos, V.; Desalermos, A. (2015). \"Cloud ERP Adoption Opportunities and Concerns: The Role of Organizational Size\". <i>Proceedings from the 48th Hawaii International Conference on System Sciences (HICSS), 2015<\/i> <b>2015<\/b>: 4211-4219. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FHICSS.2015.504\" target=\"_blank\">10.1109\/HICSS.2015.504<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Cloud+ERP+Adoption+Opportunities+and+Concerns%3A+The+Role+of+Organizational+Size&rft.jtitle=Proceedings+from+the+48th+Hawaii+International+Conference+on+System+Sciences+%28HICSS%29%2C+2015&rft.aulast=Johansson%2C+B.%3B+Alajbegovic%2C+A.%3B+Alexopoulos%2C+V.%3B+Desalermos%2C+A.&rft.au=Johansson%2C+B.%3B+Alajbegovic%2C+A.%3B+Alexopoulos%2C+V.%3B+Desalermos%2C+A.&rft.date=2015&rft.volume=2015&rft.pages=4211-4219&rft_id=info:doi\/10.1109%2FHICSS.2015.504&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HashizumeAnAnal13-18\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-HashizumeAnAnal13_18-0\" rel=\"external_link\">18.0<\/a><\/sup> <sup><a href=\"#cite_ref-HashizumeAnAnal13_18-1\" rel=\"external_link\">18.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Hashizume, K.; Rosado, D.; Fern\u00e1ndez-Medina, E.; Fernandez, E. (2013). \"An analysis of security issues for cloud computing\". <i>Journal of Internet Services and Applications<\/i> <b>4<\/b>: 5. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1186%2F1869-0238-4-5\" target=\"_blank\">10.1186\/1869-0238-4-5<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=An+analysis+of+security+issues+for+cloud+computing&rft.jtitle=Journal+of+Internet+Services+and+Applications&rft.aulast=Hashizume%2C+K.%3B+Rosado%2C+D.%3B+Fern%C3%A1ndez-Medina%2C+E.%3B+Fernandez%2C+E.&rft.au=Hashizume%2C+K.%3B+Rosado%2C+D.%3B+Fern%C3%A1ndez-Medina%2C+E.%3B+Fernandez%2C+E.&rft.date=2013&rft.volume=4&rft.pages=5&rft_id=info:doi\/10.1186%2F1869-0238-4-5&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KumarMigration12-19\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-KumarMigration12_19-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Kumar, V.; Garg, K.K. (2012). \"Migration of Services to the Cloud Environment: Challenges and Best Practices\". <i>International Journal of Computer Applications<\/i> <b>55<\/b> (1): 1\u20136. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.5120%2F8716-7105\" target=\"_blank\">10.5120\/8716-7105<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Migration+of+Services+to+the+Cloud+Environment%3A+Challenges+and+Best+Practices&rft.jtitle=International+Journal+of+Computer+Applications&rft.aulast=Kumar%2C+V.%3B+Garg%2C+K.K.&rft.au=Kumar%2C+V.%3B+Garg%2C+K.K.&rft.date=2012&rft.volume=55&rft.issue=1&rft.pages=1%E2%80%936&rft_id=info:doi\/10.5120%2F8716-7105&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PuthalCloud15-20\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-PuthalCloud15_20-0\" rel=\"external_link\">20.0<\/a><\/sup> <sup><a href=\"#cite_ref-PuthalCloud15_20-1\" rel=\"external_link\">20.1<\/a><\/sup> <sup><a href=\"#cite_ref-PuthalCloud15_20-2\" rel=\"external_link\">20.2<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Puthal, D.; Sahoo, B.; Mishra, S.; Swain, S. (2015). \"Cloud Computing Features, Issues, and Challenges: A Big Picture\". <i>Proceedings from the International Conference on Computational Intelligence and Networks (CINE), 2015<\/i> <b>2015<\/b>: 116-123. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1109%2FCINE.2015.31\" target=\"_blank\">10.1109\/CINE.2015.31<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Cloud+Computing+Features%2C+Issues%2C+and+Challenges%3A+A+Big+Picture&rft.jtitle=Proceedings+from+the+International+Conference+on+Computational+Intelligence+and+Networks+%28CINE%29%2C+2015&rft.aulast=Puthal%2C+D.%3B+Sahoo%2C+B.%3B+Mishra%2C+S.%3B+Swain%2C+S.&rft.au=Puthal%2C+D.%3B+Sahoo%2C+B.%3B+Mishra%2C+S.%3B+Swain%2C+S.&rft.date=2015&rft.volume=2015&rft.pages=116-123&rft_id=info:doi\/10.1109%2FCINE.2015.31&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CastellinaSaaS11-21\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CastellinaSaaS11_21-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Castellina, N. (December 2011). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.meritsolutions.com\/resources\/whitepapers\/Aberdeen-Research-SaaS-Cloud-ERP-Trands-2011.pdf\" target=\"_blank\">\"SaaS and Cloud ERP Trends, Observations, and Performance 2011\"<\/a> (PDF). Aberdeen Group<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.meritsolutions.com\/resources\/whitepapers\/Aberdeen-Research-SaaS-Cloud-ERP-Trands-2011.pdf\" target=\"_blank\">http:\/\/www.meritsolutions.com\/resources\/whitepapers\/Aberdeen-Research-SaaS-Cloud-ERP-Trands-2011.pdf<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=SaaS+and+Cloud+ERP+Trends%2C+Observations%2C+and+Performance+2011&rft.atitle=&rft.aulast=Castellina%2C+N.&rft.au=Castellina%2C+N.&rft.date=December+2011&rft.pub=Aberdeen+Group&rft_id=http%3A%2F%2Fwww.meritsolutions.com%2Fresources%2Fwhitepapers%2FAberdeen-Research-SaaS-Cloud-ERP-Trands-2011.pdf&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AkandeManag13-22\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AkandeManag13_22-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Akande, A.O.; April, N.A.; Van Belle, J.-P. (2013). \"Management Issues with Cloud Computing\". <i>Proceedings of the Second International Conference on Innovative Computing and Cloud Computing<\/i> <b>2013<\/b>: 119\u2013124. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1145%2F2556871.2556899\" target=\"_blank\">10.1145\/2556871.2556899<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Management+Issues+with+Cloud+Computing&rft.jtitle=Proceedings+of+the+Second+International+Conference+on+Innovative+Computing+and+Cloud+Computing&rft.aulast=Akande%2C+A.O.%3B+April%2C+N.A.%3B+Van+Belle%2C+J.-P.&rft.au=Akande%2C+A.O.%3B+April%2C+N.A.%3B+Van+Belle%2C+J.-P.&rft.date=2013&rft.volume=2013&rft.pages=119%E2%80%93124&rft_id=info:doi\/10.1145%2F2556871.2556899&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PengCloud15-23\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-PengCloud15_23-0\" rel=\"external_link\">23.0<\/a><\/sup> <sup><a href=\"#cite_ref-PengCloud15_23-1\" rel=\"external_link\">23.1<\/a><\/sup> <sup><a href=\"#cite_ref-PengCloud15_23-2\" rel=\"external_link\">23.2<\/a><\/sup> <sup><a href=\"#cite_ref-PengCloud15_23-3\" rel=\"external_link\">23.3<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Peng, G.C.A.; Gala, C. (2014). \"Cloud Erp: A New Dilemma to Modern Organisations?\". <i>Journal of Computer Information Systems<\/i> <b>54<\/b> (4): 22\u201330. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1080%2F08874417.2014.11645719\" target=\"_blank\">10.1080\/08874417.2014.11645719<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Cloud+Erp%3A+A+New+Dilemma+to+Modern+Organisations%3F&rft.jtitle=Journal+of+Computer+Information+Systems&rft.aulast=Peng%2C+G.C.A.%3B+Gala%2C+C.&rft.au=Peng%2C+G.C.A.%3B+Gala%2C+C.&rft.date=2014&rft.volume=54&rft.issue=4&rft.pages=22%E2%80%9330&rft_id=info:doi\/10.1080%2F08874417.2014.11645719&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-SubashiniASurv11-24\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-SubashiniASurv11_24-0\" rel=\"external_link\">24.0<\/a><\/sup> <sup><a href=\"#cite_ref-SubashiniASurv11_24-1\" rel=\"external_link\">24.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Subashini, S.; Kavitha, V. (2011). \"A survey on security issues in service delivery models of cloud computing\". <i>Journal of Network and Computer Applications<\/i> <b>34<\/b> (1): 1\u201311. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.jnca.2010.07.006\" target=\"_blank\">10.1016\/j.jnca.2010.07.006<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+survey+on+security+issues+in+service+delivery+models+of+cloud+computing&rft.jtitle=Journal+of+Network+and+Computer+Applications&rft.aulast=Subashini%2C+S.%3B+Kavitha%2C+V.&rft.au=Subashini%2C+S.%3B+Kavitha%2C+V.&rft.date=2011&rft.volume=34&rft.issue=1&rft.pages=1%E2%80%9311&rft_id=info:doi\/10.1016%2Fj.jnca.2010.07.006&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-FauscetteERP13-25\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-FauscetteERP13_25-0\" rel=\"external_link\">25.0<\/a><\/sup> <sup><a href=\"#cite_ref-FauscetteERP13_25-1\" rel=\"external_link\">25.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation web\">Fauscette, M. (December 2013). <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/go.oracle.com\/LP=1093?elqCampaignId=2026\" target=\"_blank\">\"ERP in the Cloud and the Modern Business\"<\/a>. Oracle<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/go.oracle.com\/LP=1093?elqCampaignId=2026\" target=\"_blank\">https:\/\/go.oracle.com\/LP=1093?elqCampaignId=2026<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 23 January 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=ERP+in+the+Cloud+and+the+Modern+Business&rft.atitle=&rft.aulast=Fauscette%2C+M.&rft.au=Fauscette%2C+M.&rft.date=December+2013&rft.pub=Oracle&rft_id=https%3A%2F%2Fgo.oracle.com%2FLP%3D1093%3FelqCampaignId%3D2026&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ClarkBeyond14-26\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-ClarkBeyond14_26-0\" rel=\"external_link\">26.0<\/a><\/sup> <sup><a href=\"#cite_ref-ClarkBeyond14_26-1\" rel=\"external_link\">26.1<\/a><\/sup> <sup><a href=\"#cite_ref-ClarkBeyond14_26-2\" rel=\"external_link\">26.2<\/a><\/sup> <sup><a href=\"#cite_ref-ClarkBeyond14_26-3\" rel=\"external_link\">26.3<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation web\">Clark, N.; Dawson, D.; Heard, K.; Manohar, M. (2014). <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.strategyand.pwc.com\/reports\/beyond-erp\" target=\"_blank\">\"Beyond ERP: New Technology, new options\"<\/a>. Booz & Company<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.strategyand.pwc.com\/reports\/beyond-erp\" target=\"_blank\">https:\/\/www.strategyand.pwc.com\/reports\/beyond-erp<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 23 January 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Beyond+ERP%3A+New+Technology%2C+new+options&rft.atitle=&rft.aulast=Clark%2C+N.%3B+Dawson%2C+D.%3B+Heard%2C+K.%3B+Manohar%2C+M.&rft.au=Clark%2C+N.%3B+Dawson%2C+D.%3B+Heard%2C+K.%3B+Manohar%2C+M.&rft.date=2014&rft.pub=Booz+%26+Company&rft_id=https%3A%2F%2Fwww.strategyand.pwc.com%2Freports%2Fbeyond-erp&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WaligumImpact08-27\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WaligumImpact08_27-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Waligum, T. (14 August 2008). <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.infoworld.com\/article\/2652900\/applications\/impact-of-saas-on-the-enterprise-erp-market.html\" target=\"_blank\">\"Impact of SaaS on the enterprise ERP market\"<\/a>. <i>InfoWorld<\/i>. IDG Communications, Inc<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.infoworld.com\/article\/2652900\/applications\/impact-of-saas-on-the-enterprise-erp-market.html\" target=\"_blank\">https:\/\/www.infoworld.com\/article\/2652900\/applications\/impact-of-saas-on-the-enterprise-erp-market.html<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 23 January 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Impact+of+SaaS+on+the+enterprise+ERP+market&rft.atitle=InfoWorld&rft.aulast=Waligum%2C+T.&rft.au=Waligum%2C+T.&rft.date=14+August+2008&rft.pub=IDG+Communications%2C+Inc&rft_id=https%3A%2F%2Fwww.infoworld.com%2Farticle%2F2652900%2Fapplications%2Fimpact-of-saas-on-the-enterprise-erp-market.html&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BinuASecurity12-28\"><span class=\"mw-cite-backlink\">\u2191 <sup><a href=\"#cite_ref-BinuASecurity12_28-0\" rel=\"external_link\">28.0<\/a><\/sup> <sup><a href=\"#cite_ref-BinuASecurity12_28-1\" rel=\"external_link\">28.1<\/a><\/sup><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Binu, S.; Meenakumari, J. (2012). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ijcse.com\/ijcse-issue.html?issue=20120304\" target=\"_blank\">\"A security framework for an enterprise system on cloud\"<\/a>. <i>Indian Journal of Computer Science and Engineering<\/i> <b>3<\/b> (4): 548\u2013552<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.ijcse.com\/ijcse-issue.html?issue=20120304\" target=\"_blank\">http:\/\/www.ijcse.com\/ijcse-issue.html?issue=20120304<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+security+framework+for+an+enterprise+system+on+cloud&rft.jtitle=Indian+Journal+of+Computer+Science+and+Engineering&rft.aulast=Binu%2C+S.%3B+Meenakumari%2C+J.&rft.au=Binu%2C+S.%3B+Meenakumari%2C+J.&rft.date=2012&rft.volume=3&rft.issue=4&rft.pages=548%E2%80%93552&rft_id=http%3A%2F%2Fwww.ijcse.com%2Fijcse-issue.html%3Fissue%3D20120304&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KumbharTheComp12-29\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-KumbharTheComp12_29-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Kumbhar, N.N.; Chaudhari, V.V.; Badhe, M.A. (2012). \"The Comprehensive Approach for Data Security in Cloud Computing: A Survey\". <i>International Journal of Computer Applications<\/i> <b>39<\/b> (18): 23\u201329. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.5120%2F5080-7433\" target=\"_blank\">10.5120\/5080-7433<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+Comprehensive+Approach+for+Data+Security+in+Cloud+Computing%3A+A+Survey&rft.jtitle=International+Journal+of+Computer+Applications&rft.aulast=Kumbhar%2C+N.N.%3B+Chaudhari%2C+V.V.%3B+Badhe%2C+M.A.&rft.au=Kumbhar%2C+N.N.%3B+Chaudhari%2C+V.V.%3B+Badhe%2C+M.A.&rft.date=2012&rft.volume=39&rft.issue=18&rft.pages=23%E2%80%9329&rft_id=info:doi\/10.5120%2F5080-7433&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-RuivoTheERP15-30\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-RuivoTheERP15_30-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Ruivo, P.; Rodrigues, J.; Oliveira, T. (2015). \"The ERP Surge of Hybrid Models - An Exploratory Research into Five and Ten Years Forecast\". <i>Procedia Computer Science<\/i> <b>64<\/b>: 594\u2013600. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.procs.2015.08.572\" target=\"_blank\">10.1016\/j.procs.2015.08.572<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+ERP+Surge+of+Hybrid+Models+-+An+Exploratory+Research+into+Five+and+Ten+Years+Forecast&rft.jtitle=Procedia+Computer+Science&rft.aulast=Ruivo%2C+P.%3B+Rodrigues%2C+J.%3B+Oliveira%2C+T.&rft.au=Ruivo%2C+P.%3B+Rodrigues%2C+J.%3B+Oliveira%2C+T.&rft.date=2015&rft.volume=64&rft.pages=594%E2%80%93600&rft_id=info:doi\/10.1016%2Fj.procs.2015.08.572&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ColumbusFive15-31\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ColumbusFive15_31-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Columbus, L. (27 January 2015). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.forbes.com\/sites\/louiscolumbus\/2015\/01\/27\/five-catalysts-accelerating-cloud-erp-growth-in-2015\" target=\"_blank\">\"Five Catalysts Accelerating Cloud ERP Growth In 2015\"<\/a>. <i>Forbes<\/i><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.forbes.com\/sites\/louiscolumbus\/2015\/01\/27\/five-catalysts-accelerating-cloud-erp-growth-in-2015\" target=\"_blank\">http:\/\/www.forbes.com\/sites\/louiscolumbus\/2015\/01\/27\/five-catalysts-accelerating-cloud-erp-growth-in-2015<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 23 January 2017<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Five+Catalysts+Accelerating+Cloud+ERP+Growth+In+2015&rft.atitle=Forbes&rft.aulast=Columbus%2C+L.&rft.au=Columbus%2C+L.&rft.date=27+January+2015&rft_id=http%3A%2F%2Fwww.forbes.com%2Fsites%2Flouiscolumbus%2F2015%2F01%2F27%2Ffive-catalysts-accelerating-cloud-erp-growth-in-2015&rfr_id=info:sid\/en.wikipedia.org:Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. The original article lists references alphabetically, but this version \u2014 by design \u2014 lists them in order of appearance.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues\">https:\/\/www.limswiki.org\/index.php\/Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","f83633bd19906c97fe01cf5c6de8eb6e_images":["https:\/\/www.limswiki.org\/images\/f\/f6\/Fig1_Saa_JofInfoSysEngMan2017_2-4.png","https:\/\/www.limswiki.org\/images\/a\/a3\/Fig2_Saa_JofInfoSysEngMan2017_2-4.png","https:\/\/www.limswiki.org\/images\/f\/f8\/Fig3_Saa_JofInfoSysEngMan2017_2-4.png"],"f83633bd19906c97fe01cf5c6de8eb6e_timestamp":1544813848,"b4dc927afe66d41e039d02b8df4b895f_type":"article","b4dc927afe66d41e039d02b8df4b895f_title":"Method-centered digital communities on protocols.io for fast-paced scientific innovation (Kindler et al. 
2017)","b4dc927afe66d41e039d02b8df4b895f_url":"https:\/\/www.limswiki.org\/index.php\/Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation","b4dc927afe66d41e039d02b8df4b895f_plaintext":"\n\nJournal:Method-centered digital communities on protocols.io for fast-paced scientific innovation\n\nFrom LIMSWiki\n\nFull article title: Method-centered digital communities on protocols.io for fast-paced scientific innovation\nJournal: F1000Research\nAuthor(s): Kindler, Lori; Stoliartchouk, Alexei; Teytelman, Leonid; Hurwitz, Bonnie L.\nAuthor affiliation(s): University of Arizona, protocols.io\nPrimary contact: Email: bhurwitz at email dot arizona dot edu\nYear published: 2017\nVolume and issue: 5\nPage(s): 2271\nDOI: 10.12688\/f1000research.9453.2\nISSN: 2046-1402\nDistribution license: Creative Commons Attribution 4.0 International\nWebsite: https:\/\/f1000research.com\/articles\/5-2271\/v2\nDownload: https:\/\/f1000research.com\/articles\/5-2271\/v2\/pdf (PDF)\n\nContents\n\n1 Abstract \n2 Introduction \n3 Protocols.io: A platform to enable methods discussion and dissemination \n\n3.1 Introducing VERVENet: The Viral Ecology Research and Virtual Exchange Network \n\n\n4 Methods \n\n4.1 Creating a user profile in protocols.io \n4.2 Adding protocols in protocols.io \n4.3 Developing groups in protocols.io \n4.4 Finding protocols \n4.5 Providing feedback on protocols \n\n\n5 Use case: VERVE Net: Virus Ecology Research and Virtual Exchange Network \n\n5.1 Molecular and bioinformatics protocols \n5.2 Protocol collections \n5.3 Groups and sharing \n5.4 Literature recommendations \n5.5 Live online discussion forum \n5.6 Platform infrastructure and interoperability \n5.7 Content and adoption \n\n\n6 Discussion and conclusions \n7 Declarations and acknowledgements 
\n\n7.1 Data and software availability \n7.2 Author contributions \n7.3 Competing interests \n7.4 Grant information \n7.5 Acknowledgements \n\n\n8 References \n9 Notes \n\n\n\nAbstract \nThe internet has enabled online social interaction for scientists beyond physical meetings and conferences. Yet despite these innovations in communication, dissemination of methods is often relegated to just academic publishing. Further, these methods remain static, with subsequent advances published elsewhere and unlinked. For communities undergoing fast-paced innovation, researchers need new capabilities to share, obtain feedback, and publish methods at the forefront of scientific development. For example, a renaissance in virology is now underway given the new metagenomic methods to sequence viral DNA directly from an environment. Metagenomics makes it possible to \u201csee\u201d natural viral communities that could not be previously studied through culturing methods. Yet, the knowledge of specialized techniques for the production and analysis of viral metagenomes remains in a subset of labs. This problem is common to any community using and developing emerging technologies and techniques. We developed new capabilities to create virtual communities in protocols.io, an open-access platform for disseminating protocols and knowledge at the forefront of scientific development. To demonstrate these capabilities, we present a virology community forum called VERVENet. These new features allow virology researchers to share protocols and their annotations and optimizations; connect with the broader virtual community to share knowledge, job postings, and conference announcements through a common online forum; and discover the current literature through personalized recommendations to promote discussion of cutting-edge research. 
Virtual communities in protocols.io enhance a researcher\u2019s ability to discuss and share protocols, connect with fellow community members, and learn about new and innovative research in the field. The web-based software for developing virtual communities is free to use on protocols.io. Data are available through public APIs at protocols.io.\n\nIntroduction \nThe internet has enabled online social interaction for scientists beyond physical meetings and conferences. Twitter, Facebook, and ResearchGate[1][2][3] provide valuable online forums that many researchers use to share knowledge. At the same time, academic publishing remains time-consuming and inefficient for communicating methodology. Protocols are often relegated to supplementary information, if shared at all. There is no good mechanism for easily discussing, troubleshooting, and improving published or unpublished techniques.\nThis need is even more apparent in emerging fields such as viral ecology, where laboratory, field, and bioinformatics methods are being actively developed.[4] For example, new metagenomic techniques to sequence viral DNA directly from environmental samples have led to rapid advances in both molecular and bioinformatic protocols.[5] These protocols, however, are highly specialized and generally used in a few highly proficient labs because: (i) viral metagenomes (viromes) are difficult to produce due to low quantities of DNA and refined isolation and purification methods, (ii) the vast majority of viral sequences are unknown (usually >90%[6]) complicating bioinformatics analyses, and (iii) newly emerging comparative and functional metagenomic analyses exist but require ongoing community refinement and development.\nGiven the experimental nature of methods, the virology community has expressed a need to foster discussions about these protocols towards improved methodologies and increasing connectivity and collaboration among researchers.[7] The challenge is to develop a method-centered 
collaborative platform that recapitulates the functionality of a scientific meeting - a digital community for connecting with fellow researchers to share and discover state-of-the-art knowledge.\nHere we describe new capabilities in protocols.io (http:\/\/www.protocols.io), an open-access platform, to create virtual communities for disseminating protocols and knowledge at the forefront of scientific development. To demonstrate these capabilities, we describe a viral ecology community forum called VERVENet (https:\/\/www.protocols.io\/groups\/verve-net) that strives to increase connectivity and knowledge dissemination in viral ecology research at all levels, from undergraduates to accomplished viral ecologists. These new community features enhance a researcher\u2019s ability to discuss and share protocols, connect with fellow community members, and learn about new and innovative research in the field. The web-based software for developing virtual communities is free for use on protocols.io and is further described here.\n\nProtocols.io: A platform to enable methods discussion and dissemination \nProtocols.io is a free service for industry and academic scientists to share or maintain private protocols for research.[8] The driving force behind software development is to provide a mechanism for scientists to share improvements and corrections to protocols so that others are not continually rediscovering knowledge that scientists have not had the time or wherewithal to publish. Protocols.io provides a free, up-to-date, crowd-sourced protocol repository for the life science community. This software is available as a web-based platform or smartphone application[9][10] to enable mobile solutions for research and bench work. Per best practices in mobile computing, these apps offer extensive options and control of push notifications. 
In fall 2014, protocols.io offered a well-developed platform for users to share molecular methods; however, no capabilities were in place to share bioinformatics and other methods among groups. To this end, the viral ecology community teamed up with protocols.io to create new group capabilities, develop bioinformatics protocols, and enhance discussion forums for news, methods, and literature.\n\nIntroducing VERVENet: The Viral Ecology Research and Virtual Exchange Network \nThe Viral Ecology Research and Virtual Exchange Network (VERVENet) is a collaboration between the University of Arizona and protocols.io to deliver an online forum for the virology community. To enable this forum, new group functionality was built into protocols.io to promote scientific communication and collaboration. Specifically, group features were developed on top of existing capabilities to share molecular methods in order to (i) share protocols and their annotations and optimizations; (ii) fuel connectivity among viral ecology researchers for sharing data sets, knowledge, job postings, and conference announcements through a common online forum called VERVENet; and (iii) facilitate literature discovery through personalized recommendations to promote discussion on cutting edge viral ecology research. Through developing these interconnected resources in protocols.io for virtual communities, we developed a \u201cgo-to\u201d site for viral ecology research.[11] Moreover, these tools are broadly useful to any community or individual lab for promoting scientific inquiry, reproduction of results, dissemination of protocols, and re-use. Specifically, new forums can be created in a matter of minutes to enable connectivity among groups of any size, with tools described here under use cases. The VERVE Net forum is a place to discuss newly emerging methods in viral ecology for any kind of data such as omics or image datasets. 
However, while images, videos, and tables can be added to protocols\/steps to enhance the description of methods, the protocols.io platform is not a data storage site.\n\nMethods \nCreating a user profile in protocols.io \nUsers can view protocols and all public content anonymously, but to interact with the platform, registration is necessary. Registration is quick, as only email and password are required to create an account; however, users are encouraged to create profiles containing their name, website, affiliation, and research interests. Others can search and find a user based on name or keywords. Moreover, user profiles are attached to any material on protocols.io that the user posts publicly. User profiles also contain a field for ORCID[12] so that researchers can tie their profile back to a common identifier and highlight their work in the field. Researchers can also include a biography that describes how they got into the field and what intrigues them. Thus, profiles allow users to add in their own content, rather than simply browse existing content.\n\nAdding protocols in protocols.io \nAfter registration, new protocols can be entered (Figure 1). By default, all protocols are private and can be shared with individual collaborators or any of the groups. The protocols are structured with tabs for the \u201csteps,\u201d \u201cdescription,\u201d \u201cguidelines,\u201d and \u201ccomments.\u201d When entering the steps, a list of components that can be added to the steps is located on the far right and allows a clear detailing of wetlab or computational portions of the method. Related steps of the protocol can also be easily grouped together into sections such as \"preparation,\" \"DNA extraction,\" and \"analysis,\" etc. Steps may be entered one by one by typing into the text box or by pasting steps from another file, facilitating import of existing protocols. For each step, annotations can be added to make notes on specific steps. 
Once complete, the protocol can be run in a step-by-step format.\n\nFigure 1. Entering a protocol in protocols.io. Protocols are entered by providing a broad description, information about authors, any prior materials or background required, and detailed step-by-step methods to implement the protocol. Protocols can remain private to an individual or group, or be released to the public.\n\nOnce a protocol has been created, there are several options for sharing it with collaborators or a group. To make the protocol publicly viewable, one will need to click the \"publish\" button. A protocol can be reassigned to another individual with a protocols.io account. For current instructions on adding and using protocols, which change with ongoing development, see the tutorials (http:\/\/www.protocols.io\/help) at protocols.io.[13]\n\nDeveloping groups in protocols.io \nTo create a group, one must have an account and be logged in. For example, here we describe the VERVE Net group; however, it is possible to create any group. To create a group, users can click on their personal icon in the upper right-hand corner and select \u201c+ new group.\u201d They will be prompted to enter a group name, image, description of the group, research interests, external website address, physical location of the group, and an affiliation. The user will also decide if the group is open to anyone, by invitation only, or open to membership requests. In addition, the user can choose if the group is visible to others or private. Users are able to invite members into their group and control the privileges of their members. Moreover, as the owner of a group, the user is able to invite other subgroups, such as in the VERVE example, where individual labs are subgroups.\n\nFinding protocols \nProtocols on protocols.io can be tagged to allow users to quickly find protocols or collections of protocols in a particular area of interest. 
Users can also find protocols or other content using the global search at the top of each page, allowing users to search within the entire forum or specific sections of the forum.\n\nProviding feedback on protocols \nprotocols.io offers three methods for feedback directly from users: Twitter, email to the protocols.io developers (info@protocols.io), and a feedback forum where users and developers alike can respond. These comments are then used to fuel future development. Further, protocols.io recently initiated an ambassadors program where power users (usually graduate students or postdocs) who are directly connected to diverse communities provide feedback from a user perspective. Thus, future development is guided by community input from these sources.\n\nUse case: VERVE Net: Virus Ecology Research and Virtual Exchange Network \nMolecular and bioinformatics protocols \nOften, detailed \u201ctricks of the trade\u201d associated with lab, field, and bioinformatics protocols are not well-described in publications, and at best they are stashed in supplemental materials. Practical information associated with running these protocols under varied conditions cannot be curated, documented, or discussed among students, postdocs, technicians, and faculty working in virology. Moreover, knowledge on when to use a particular version of a given protocol is not easily captured. Protocols.io provides a flexible mechanism wherein protocols can be documented in a step-wise fashion to easily pivot between molecular and bioinformatics methodologies, link to useful websites or code in GitHub[14], as well as reference manuals or original source materials for protocols, as exemplified in the VERVENet forum.\nThe user entering the protocol may not necessarily be the author of the original method. 
However, by providing links to the primary work, users can attribute credit to the original author while at the same time adding their own updates to the method either while they enter it, or at a later time. Further, other users have the capability to add notes and warnings to existing protocols in protocols.io. This functionality includes a mechanism to email the protocol author for protocol troubleshooting. Corrections and updates made by the protocol authors and users automatically trigger notifications emailed to researchers who use that protocol. Lastly, users can \"fork\" or copy existing protocols for further refinement or alternate uses while still maintaining links back to the original for credit and reference. As such, the protocol is a living document for the community to reuse and continually refine.\nFor publication, authors have the option to enter detailed methods into protocols.io, issue a digital object identifier (DOI[15]), and link to the protocols.io record from the methods section. This practice is now being encouraged in journal submissions and by funding agencies.\n\nProtocol collections \nBecause protocols are often used in conjunction with other protocols, protocols.io has the capability to link protocols into user-defined workflows. This is particularly important for publications that may use a collection of varied protocols (field, lab, and bioinformatics) that are derived from many sources (protocols from the user or other users). In providing a collection of protocols associated with a publication, the authors enable their work to be replicated, easy-to-follow, and transparent to other members of the community in a way that can be referenced and cited. 
For example, a collection of protocols derived from a recent publication on the human skin double-stranded DNA virome is available in VERVENet.[16][17] Thus, collections provide a mechanism for furthering open-science efforts.\nProtocol collections also provide a mechanism to learn by example for early career scientists or those branching into a new area of scientific inquiry. In particular, detailed protocols associated with a toolkit or workshop can be included, along with multimedia options such as slides, video, or links to virtual machines with example datasets and code.[18][19] This is particularly important for bioinformatics protocols that often include multiple programs and steps in an analysis for a given publication. Further, individual tools may have a collection of protocols that describe specific use-cases, example datasets, and varied options that the tool developers may wish to convey to their users.\n\nGroups and sharing \nIndividual members can form groups, where the owner has the ability to choose the level of accessibility for fellow members. The groups can share literature recommendations, discussions, protocols, news, events, and job opportunities (Figure 2). Subgroups can form under the umbrella of a larger group with a common interest. This subgroup\/supergroup relationship allows smaller group activities to be shared with a larger virtual community with common interests. In the case of VERVENet, this supergroup links the broader research in virology with the subgroups of individual labs and more specific research interests such as plant viruses.\n\r\n\n\n\n\n\n\n\n\n\n\n Figure 2. The VERVENet group in protocols.io. Groups in protocols.io display information about the group objectives, members, subgroups, the group library and literature recommendations, group discussions, news, jobs, and events. 
Groups have the capacity to control access, from making groups and content public and allowing anyone to join, to restricting content and requiring invitation-only membership. VERVENet is an example of a public forum for virology.\n\n\n\nLiterature recommendations \nEach of the groups includes a literature recommendation system. Its algorithm provides personalized publication recommendations based on a library from a user or group. This algorithm is used to develop libraries for viral ecology user groups that continually recommend new publications based on growing reading lists from individual users who are part of the group. This functionality allows virologists to make their reading lists public, thereby helping new scientists joining the field in their topic area. The libraries from subgroups also fuel the shared public reading list within the VERVENet group, thereby creating enhanced fluidity and cross-posted content between the groups.\n\nLive online discussion forum \nEach of the groups in protocols.io contains a live online discussion. Discussions can be generated directly on the discussion tab or cross-posted from discussions on specific protocols, news, or literature. Each of the discussions can reference outside websites, manuals, or online resources. This discussion forum enables users to discuss tips and tricks for specific protocols, review reagents linked to particular protocols, and reference outside resources that were not included in the original protocol.\nProtocols.io also includes \u201cjournal-club\u201d capabilities to enable online discussions of published research by researchers and authors. 
Other unique features in protocols.io include a career advice forum with a panel of mentors[20] and a \u201cbehind the article\u201d essay forum.[21] These communication forums allow researchers to share their stories about how papers, protocols, or research efforts came about; such stories are both interesting to the community and informative for early career scientists.\n\nPlatform infrastructure and interoperability \nComputers, tablets, and smart phones are becoming fundamental tools for scientists today. Furthermore, social networking and shared cyberinfrastructures are offering powerful new mechanisms to connect communities and science from across the world. Protocols.io leverages these powerful new tools and software capabilities to provide an online forum for viral ecology researchers to connect and share knowledge and resources. All components of protocols.io and the VERVENet forum are mobile-friendly and interoperable for use on diverse devices in the lab, on the desktop, or on the go.\n\nContent and adoption \nThe VERVENet group currently contains 365 live protocols, 212 news articles, and 59 job opportunities. There is an event calendar that contains workshops and conferences specific to virology through the fall of 2016. We have 231 members and 22 subgroups. Examples of subgroups include the Plant Virus Ecology Network, which originally formed in 2007[22]; the Chlorovirus Group, ECOGEO[23]; and 18 individual labs. The International Society for Viruses of Microorganisms has listed VERVE Net on their website as a resource.[24]\n\nDiscussion and conclusions \nThe primary goal of new group functionality in protocols.io is to provide a robust web application for sharing up-to-date protocols, literature, and community features (news, jobs, discussions). This work is exemplified in VERVE Net, a virtual community forum for virology. 
Fundamental to this goal is the ability for researchers to establish groups based on similar interests and share knowledge, without a priori knowledge of key members in a given field.\nWe have designed an infrastructure that has multiple entry points for establishing relationships among users, ranging from self-proclaimed groups or areas of interest to options to join groups maintained by others in an area of interest to the user, fueled by related protocols or reading lists. Moreover, news feeds about funding opportunities, job postings, or collaborative research opportunities can be fine-tuned according to interest. These connections will allow the forum to evolve naturally given rapidly developing trends and new protocols. Protocols.io is open-access and is both free-to-read and free-to-publish. The revenue and sustainability model is based on the sale of data services to reagent vendors (most popular protocols, protocol improvements, and reagent-protocol links). Protocols.io also charges fees for private non-academic groups.\nProtocols.io is a central resource to connect, collaborate, share, and innovate within virtual communities. The VERVENet forum demonstrates how this new group functionality allows researchers to promote scientific inquiry, reproduction of results, and dissemination and optimization of both molecular and bioinformatics protocols, as a virtual community.\n\nDeclarations and acknowledgements \nData and software availability \nProtocols.io and the VERVENet community forum are committed to open access for data content and interoperability. To that end, the content in protocols is available through an application programming interface (API) for advanced data mining, and no registration is required to view protocols, comments, or annotations. 
All public protocols are archived with CLOCKSS for long-term digital preservation.[25] Users will also be able to access public protocols.io content mirrored at the Center for Open Science.\n\nAuthor contributions \nLK wrote the manuscript, tested the platform, added content, and provided feedback on features and functionality. AS developed the platform and tested and designed features and functionality. LT and BLH designed VERVENet, tested the system, provided feedback on features and functionality, and wrote the manuscript. All authors read and approved the final manuscript.\n\nCompeting interests \nLeonid Teytelman and Alexei Stoliartchouk are employees of protocols.io and both own equity in the company.\n\nGrant information \nThis work was funded by a grant to B.L.H. and L.T. from the Gordon and Betty Moore Foundation (GBMF4733).\nThe funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.\n\nAcknowledgements \nWe would like to thank Celina Gomez and James Thornton for adding \u201cseed\u201d protocols into VERVENet, and Vladimir Frolov for development of the interface and group functionality.\n\nReferences \n\n\n\u2191 Ellison, N.B.; Steinfield, C.; Lampe, C. (2007). \"The Benefits of Facebook \u201cFriends:\u201d Social Capital and College Students\u2019 Use of Online Social Network Sites\". Journal of Computer-Mediated Communication 12 (4): 1143\u20131168. doi:10.1111\/j.1083-6101.2007.00367.x.   \n\n\u2191 Kwak, H.; Lee, C.; Park, H.; Moon, S. (2010). \"What is Twitter, a social network or a news media?\". Proceedings of the 19th International Conference on World Wide Web 2010: 591-600. doi:10.1145\/1772690.1772751.   \n\n\u2191 Thelwall, M.; Kousha, K. (2015). \"ResearchGate: Disseminating, communicating, and measuring Scholarship?\". Journal of the Association for Information Science and Technology 66 (5): 876\u2013889. doi:10.1002\/asi.23236.   \n\n\u2191 Weinbauer, M.G.; Rowe, J.M.; Wilhelm, S.W., ed. 
(2010). Manual of Aquatic Viral Ecology. American Society of Limnology and Oceanography. doi:10.4319\/mave.2010.978-0-9845591-0-7.   \n\n\u2191 Brum, J.R.; Sullivan, M.B. (2015). \"Rising to the challenge: Accelerated pace of discovery transforms marine virology\". Nature Reviews Microbiology 13 (3): 147-59. doi:10.1038\/nrmicro3404. PMID 25639680.   \n\n\u2191 Hurwitz, B.L.; Sullivan, M.B. (2013). \"The Pacific Ocean virome (POV): A marine viral metagenomic dataset and associated protein clusters for quantitative viral ecology\". PLoS One 8 (2): e57355. doi:10.1371\/journal.pone.0057355. PMC PMC3585363. PMID 23468974. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3585363 .   \n\n\u2191 \"Aquatic Viruses\". Facebook. 02 July 2014. https:\/\/www.facebook.com\/AquaticViruses\/posts\/760704383968498 . Retrieved 09 August 2016 .   \n\n\u2191 Teytelman, L.; Stoliartchouk, A. (2015). \"Protocols.io: Reducing the knowledge that perishes because we do not publish it\". Information Services & Use 35 (1\u20132): 109\u2013115. doi:10.3233\/ISU-150769.   \n\n\u2191 ZappyLab. \"protocols.io\". App Store. https:\/\/itunes.apple.com\/us\/app\/protocols.io\/id976303827 . Retrieved 21 March 2016 .   \n\n\u2191 ZappyLab. \"protocols.io\". Google Play. https:\/\/play.google.com\/store\/apps\/details?id=com.zappylab.protocols . Retrieved 21 March 2016 .   \n\n\u2191 \"VERVE Net\". protocols.io. https:\/\/www.protocols.io\/g\/verve-net . Retrieved 21 March 2016 .   \n\n\u2191 Haak, L.L.; Fenner, M.; Paglione, L. et al. (2012). \"ORCID: A system to uniquely identify researchers\". Learned Publishing 25 (4): 259\u2013264. doi:10.1087\/20120404.   \n\n\u2191 \"Explore protocols.io\". ZappyLab, Inc. https:\/\/www.protocols.io\/help\/explore . Retrieved 21 March 2016 .   \n\n\u2191 Dabbish, L.; Stuart, C.; Tsay, J.; Herbsleb, J. (2012). \"Social coding in GitHub: Transparency and collaboration in an open software repository\". 
Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work 2012: 1277-1286. doi:10.1145\/2145204.2145396.   \n\n\u2191 Paskin, N. (2009). \"Digital Object Identifier (DOI\u00ae) System\". In Bates, M.J.; Maack, M.N.. Encyclopedia of Library and Information Sciences (3rd ed.). CRC Press. pp. 1586\u201392. ISBN 9780849397110.   \n\n\u2191 Hannigan, G.D.; Meisel, J.S.; Tyldsley, A.S. et al. (2015). \"The human skin double-stranded DNA virome: Topographical and temporal diversity, genetic enrichment, and dynamic associations with the host microbiome\". mBio 6 (5): e01578-15. doi:10.1128\/mBio.01578-15. PMC PMC4620475. PMID 26489866. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4620475 .   \n\n\u2191 Hannigan, G.D.; Meisel, J.S.; Tyldsley, A.S. et al. (10 March 2016). \"The human skin double-stranded DNA virome: Topographical and temporal diversity, genetic enrichment, and dynamic associations with the host microbiome\". protocols.io. doi:10.17504\/protocols.io.ekubcww. https:\/\/www.protocols.io\/view\/The-Human-Skin-dsDNA-Virome-Topographical-and-Temp-ekubcww .   \n\n\u2191 Hurwitz, B. (24 November 2015). \"QIIME: Moving Pictures of the human microbiome\". protocols.io. doi:10.17504\/protocols.io.d5288d. https:\/\/www.protocols.io\/view\/QIIME-Moving-Pictures-of-the-human-microbiome-d5288d .   \n\n\u2191 Caporaso, J.G.; Kuczynski, J.; Stombaugh, J. et al. (2010). \"QIIME allows analysis of high-throughput community sequencing data\". Nature Methods 7 (5): 335-6. doi:10.1038\/nmeth.f.303. PMC PMC3156573. PMID 20383131. http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3156573 .   \n\n\u2191 \"Academic Career Forum\". protocols.io. https:\/\/www.protocols.io\/g\/academic-career-forum . Retrieved 21 March 2016 .   \n\n\u2191 \"Essays\". PubChase. https:\/\/www.pubchase.com\/essays?mostviewed . Retrieved 21 March 2016 .   \n\n\u2191 Malmstrom, C.M.; Melcher, U.; Bosque-P\u00e9rez, N.A.. 
\"The expanding field of plant virus ecology: historical foundations, knowledge gaps, and research directions\". Virus Research 159 (2): 84-94. doi:10.1016\/j.virusres.2011.05.010.   \n\n\u2191 Wood-Charlson, E.; DeLong, E.; Workshop I participants (24 November 2015). \"ECOGEO 2015 Workshop I Final Report\". EarthCube. https:\/\/www.earthcube.org\/document\/2015\/2015ecogeo-final-report . Retrieved 21 March 2016 .   \n\n\u2191 \"Resources\". International Society for Viruses of Microorganisms. http:\/\/www.isvm.org\/resources.html . Retrieved 21 March 2016 .   \n\n\u2191 \"protocols.io for developers\". protocols.io. https:\/\/www.protocols.io\/developers . Retrieved 21 March 2016 .   \n\n\nNotes \nThis presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added.\n\n\n\n\n\n\nSource: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\">https:\/\/www.limswiki.org\/index.php\/Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation<\/a>\n\t\t\t\t\tCategories: LIMSwiki journal articles (added in 2017)LIMSwiki journal articles (all)LIMSwiki journal articles on data management and sharingLIMSwiki journal articles on informaticsLIMSwiki journal articles on software\n\n","b4dc927afe66d41e039d02b8df4b895f_html":"<body class=\"mediawiki ltr sitedir-ltr ns-206 ns-subject page-Journal_Method-centered_digital_communities_on_protocols_io_for_fast-paced_scientific_innovation skin-monobook action-view\">\n<div id=\"rdp-ebb-globalWrapper\">\n\t\t<div id=\"rdp-ebb-column-content\">\n\t\t\t<div id=\"rdp-ebb-content\" class=\"mw-body\" role=\"main\">\n\t\t\t\t<a id=\"rdp-ebb-top\"><\/a>\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t<h1 id=\"rdp-ebb-firstHeading\" class=\"firstHeading\" 
lang=\"en\">Journal:Method-centered digital communities on protocols.io for fast-paced scientific innovation<\/h1>\n\t\t\t\t\n\t\t\t\t<div id=\"rdp-ebb-bodyContent\" class=\"mw-body-content\">\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\n\t\t\t\t\t<!-- start content -->\n\t\t\t\t\t<div id=\"rdp-ebb-mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n\n\n<h2><span class=\"mw-headline\" id=\"Abstract\">Abstract<\/span><\/h2>\n<p>The internet has enabled online social interaction for scientists beyond physical meetings and conferences. Yet despite these innovations in communication, dissemination of methods is often relegated to just academic publishing. Further, these methods remain static, with subsequent advances published elsewhere and unlinked. For communities undergoing fast-paced innovation, researchers need new capabilities to share, obtain feedback, and publish methods at the forefront of scientific development. For example, a renaissance in virology is now underway given the new metagenomic methods to sequence viral DNA directly from an environment. Metagenomics makes it possible to \u201csee\u201d natural viral communities that could not be previously studied through culturing methods. Yet, the knowledge of specialized techniques for the production and analysis of viral metagenomes remains in a subset of labs. This problem is common to any community using and developing emerging technologies and techniques. We developed new capabilities to create virtual communities in protocols.io, an open-access platform for disseminating protocols and knowledge at the forefront of scientific development. To demonstrate these capabilities, we present a virology community forum called VERVENet. 
These new features allow virology researchers to share protocols and their annotations and optimizations; connect with the broader virtual community to share knowledge, job postings, conference announcements through a common online forum; and discover the current literature through personalized recommendations to promote discussion of cutting edge research. Virtual communities in protocols.io enhance a researcher\u2019s ability to discuss and share protocols, connect with fellow community members, and learn about new and innovative research in the field. The web-based software for developing virtual communities is free to use on protocols.io. Data are available through public APIs at <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.protocols.io\/\" target=\"_blank\">protocols.io<\/a>.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<p>The internet has enabled online social interaction for scientists beyond physical meetings and conferences. Twitter, Facebook, and ResearchGate<sup id=\"rdp-ebb-cite_ref-EllisonTheBen07_1-0\" class=\"reference\"><a href=\"#cite_note-EllisonTheBen07-1\" rel=\"external_link\">[1]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-KwakWhatIs10_2-0\" class=\"reference\"><a href=\"#cite_note-KwakWhatIs10-2\" rel=\"external_link\">[2]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-ThelwallRes14_3-0\" class=\"reference\"><a href=\"#cite_note-ThelwallRes14-3\" rel=\"external_link\">[3]<\/a><\/sup> provide valuable online forums that many researchers use to share knowledge. At the same time, academic publishing remains time consuming and inefficient for communicating methodology. Protocols are often relegated to supplementary <a href=\"https:\/\/www.limswiki.org\/index.php\/Information\" title=\"Information\" target=\"_blank\" class=\"wiki-link\" data-key=\"6300a14d9c2776dcca0999b5ed940e7d\">information<\/a>, if shared at all. 
There is no good mechanism for easily discussing, troubleshooting, and improving published or unpublished techniques.\n<\/p><p>This need is even more apparent in emerging fields such as viral ecology, where <a href=\"https:\/\/www.limswiki.org\/index.php\/Laboratory\" title=\"Laboratory\" target=\"_blank\" class=\"wiki-link\" data-key=\"c57fc5aac9e4abf31dccae81df664c33\">laboratory<\/a>, field, and <a href=\"https:\/\/www.limswiki.org\/index.php\/Bioinformatics\" title=\"Bioinformatics\" target=\"_blank\" class=\"wiki-link\" data-key=\"8f506695fdbb26e3f314da308f8c053b\">bioinformatics<\/a> methods are being actively developed.<sup id=\"rdp-ebb-cite_ref-WeinbauerManual10_4-0\" class=\"reference\"><a href=\"#cite_note-WeinbauerManual10-4\" rel=\"external_link\">[4]<\/a><\/sup> For example, new metagenomic techniques to sequence viral DNA directly from environmental samples have led to rapid advances in both molecular and bioinformatic protocols.<sup id=\"rdp-ebb-cite_ref-BrumRising15_5-0\" class=\"reference\"><a href=\"#cite_note-BrumRising15-5\" rel=\"external_link\">[5]<\/a><\/sup> These protocols, however, are highly specialized and generally used in a few highly proficient labs because: (i) viral metagenomes (viromes) are difficult to produce due to low quantities of DNA and the need for refined isolation and purification methods, (ii) the vast majority of viral sequences are unknown (usually >90%<sup id=\"rdp-ebb-cite_ref-HurwitzThePacific13_6-0\" class=\"reference\"><a href=\"#cite_note-HurwitzThePacific13-6\" rel=\"external_link\">[6]<\/a><\/sup>), complicating bioinformatics analyses, and (iii) newly emerging comparative and functional metagenomic analyses exist but require ongoing community refinement and development.\n<\/p><p>Given the experimental nature of methods, the virology community has expressed a need to foster discussions about these protocols toward improved methodologies and increased connectivity and collaboration among researchers.<sup 
id=\"rdp-ebb-cite_ref-AVFacebook_7-0\" class=\"reference\"><a href=\"#cite_note-AVFacebook-7\" rel=\"external_link\">[7]<\/a><\/sup> The challenge is to develop a method-centered collaborative platform that recapitulates the functionality of a scientific meeting - a digital community for connecting with fellow researchers to share and discover state-of-the-art knowledge.\n<\/p><p>Here we describe new capabilities in protocols.io (<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.protocols.io\" target=\"_blank\">http:\/\/www.protocols.io<\/a>), an open-access platform, to create virtual communities for disseminating protocols and knowledge at the forefront of scientific development. To demonstrate these capabilities, we describe a viral ecology community forum called VERVENet (<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.protocols.io\/groups\/verve-net\" target=\"_blank\">https:\/\/www.protocols.io\/groups\/verve-net<\/a>) that strives to increase connectivity and knowledge dissemination in viral ecology research at all levels, from undergraduates to accomplished viral ecologists. These new community features enhance a researcher\u2019s ability to discuss and share protocols, connect with fellow community members, and learn about new and innovative research in the field. 
The web-based software for developing virtual communities is free for use on protocols.io and is further described here.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Protocols.io:_A_platform_to_enable_methods_discussion_and_dissemination\">Protocols.io: A platform to enable methods discussion and dissemination<\/span><\/h2>\n<p>Protocols.io is a free service for industry and academic scientists to share or maintain private protocols for research.<sup id=\"rdp-ebb-cite_ref-TeytelmanProto15_8-0\" class=\"reference\"><a href=\"#cite_note-TeytelmanProto15-8\" rel=\"external_link\">[8]<\/a><\/sup> The driving force behind software development is to provide a mechanism for scientists to share improvements and corrections to protocols so that others are not continuously re-discovering knowledge that scientists have not had the time or wherewithal to publish. Protocols.io provides a free, up-to-date, crowd-sourced protocol repository for the life science community. This software is available as a web-based platform or smart phone application<sup id=\"rdp-ebb-cite_ref-AppleProtocols_9-0\" class=\"reference\"><a href=\"#cite_note-AppleProtocols-9\" rel=\"external_link\">[9]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-GoogleProtocols_10-0\" class=\"reference\"><a href=\"#cite_note-GoogleProtocols-10\" rel=\"external_link\">[10]<\/a><\/sup> to enable mobile solutions for research and bench work. Per best practices in mobile computing, these apps offer extensive options and control of push notifications. In fall 2014, protocols.io offered a well-developed platform for users to share molecular methods; however, no capabilities were in place to share bioinformatics and other methods among groups. 
To this end, the viral ecology community teamed up with protocols.io to create new group capabilities, develop bioinformatics protocols, and enhance discussion forums for news, methods, and literature.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Introducing_VERVENet:_The_Viral_Ecology_Research_and_Virtual_Exchange_Network\">Introducing VERVENet: The Viral Ecology Research and Virtual Exchange Network<\/span><\/h3>\n<p>The Viral Ecology Research and Virtual Exchange Network (VERVENet) is a collaboration between the University of Arizona and protocols.io to deliver an online forum for the virology community. To enable this forum, new group functionality was built into protocols.io to promote scientific communication and collaboration. Specifically, group features were developed on top of existing capabilities to share molecular methods in order to (i) share protocols and their annotations and optimizations; (ii) fuel connectivity among viral ecology researchers for sharing data sets, knowledge, job postings, and conference announcements through a common online forum called VERVENet; and (iii) facilitate literature discovery through personalized recommendations to promote discussion on cutting edge viral ecology research. Through developing these interconnected resources in protocols.io for virtual communities, we developed a \u201cgo-to\u201d site for viral ecology research.<sup id=\"rdp-ebb-cite_ref-VERVEatPtotocols_11-0\" class=\"reference\"><a href=\"#cite_note-VERVEatPtotocols-11\" rel=\"external_link\">[11]<\/a><\/sup> Moreover, these tools are broadly useful to any community or individual lab for promoting scientific inquiry, reproduction of results, dissemination of protocols, and re-use. Specifically, new forums can be created in a matter of minutes to enable connectivity among groups of any size, with tools described here under use cases. 
The VERVE Net forum is a place to discuss newly emerging methods in viral ecology for any kind of data such as omics or image datasets. However, while images, videos, and tables can be added to protocols\/steps to enhance the description of methods, the protocols.io platform is not a data storage site.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Methods\">Methods<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Creating_a_user_profile_in_protocols.io\">Creating a user profile in protocols.io<\/span><\/h3>\n<p>Users can view protocols and all public content anonymously, but to interact with the platform, registration is necessary. Registration is quick, as only email and password are required to create an account; however, users are encouraged to create profiles containing their name, website, affiliation, and research interests. Others can search and find a user based on name or keywords. Moreover, user profiles are attached to any material on protocols.io that the user posts publicly. User profiles also contain a field for ORCID<sup id=\"rdp-ebb-cite_ref-HaakORCID12_12-0\" class=\"reference\"><a href=\"#cite_note-HaakORCID12-12\" rel=\"external_link\">[12]<\/a><\/sup> so that researchers can tie their profile back to a common identifier and highlight their work in the field. Researchers can also include a biography that describes how they got into the field and what intrigues them. Thus, profiles allow users to add in their own content, rather than simply browse existing content.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Adding_protocols_in_protocols.io\">Adding protocols in protocols.io<\/span><\/h3>\n<p>After registration, new protocols can be entered (Figure 1). By default, all protocols are private and can be shared with individual collaborators or any of the groups. 
The protocols are structured with tabs for the \u201csteps,\u201d \u201cdescription,\u201d \u201cguidelines,\u201d and \u201ccomments.\u201d When entering the steps, a list of components that can be added to the steps is located on the far right and allows a clear detailing of wetlab or computational portions of the method. Related steps of the protocol can also be easily grouped together into sections such as \"preparation,\" \"DNA extraction,\" and \"analysis,\" etc. Steps may be entered one by one by typing into the text box or by pasting steps from another file, facilitating import of existing protocols. For each step, annotations can be added to make notes on specific steps. Once complete, the protocol can be run in a step-by-step format.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig1_Kindler_F1000Res2017_5.gif\" class=\"image wiki-link\" target=\"_blank\" data-key=\"95911cd0bbed54875227cf4673968b93\"><img alt=\"Fig1 Kindler F1000Res2017 5.gif\" src=\"https:\/\/www.limswiki.org\/images\/3\/33\/Fig1_Kindler_F1000Res2017_5.gif\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 1.<\/b> Entering a protocol in protocols.io. Protocols are entered by providing a broad description, information about authors, any prior materials or background required, and detailed step-by-step methods to implement the protocol. Protocols can remain private to an individual or group, or released to the public.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<p>Once a protocol has been created, there are several options for sharing it with collaborators or a group. To make the protocol publicly viewable, one will need to click the \"publish\" button. 
A protocol can be reassigned to another individual with a protocols.io account. For ongoing development and changes to adding and using protocols, see the tutorials (<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.protocols.io\/help\" target=\"_blank\">http:\/\/www.protocols.io\/help<\/a>) at protocols.io.<sup id=\"rdp-ebb-cite_ref-protoHelp_13-0\" class=\"reference\"><a href=\"#cite_note-protoHelp-13\" rel=\"external_link\">[13]<\/a><\/sup>\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Developing_groups_in_protocols.io\">Developing groups in protocols.io<\/span><\/h3>\n<p>To create a group, one must have an account and be logged in. For example, here we describe the VERVE Net group; however, it is possible to create any group. To create a group, users can click on their personal icon in the upper right-hand corner and select \u201c+ new group.\u201d They will be prompted to enter a group name, image, description of the group, research interests, external website address, physical location of the group, and an affiliation. The user will also decide if the group is open to anyone, by invitation only, or open to membership requests. In addition, the user can choose if the group is visible to others or private. Users are able to invite members into their group and control the privileges of their members. Moreover, as the owner of a group, the user is able to invite other subgroups, such as in the VERVE example, where individual labs are subgroups.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Finding_protocols\">Finding protocols<\/span><\/h3>\n<p>Protocols on protocols.io can be tagged to allow users to quickly find protocols or collections of protocols in a particular area of interest. 
Users can also find protocols or other content using the global search at the top of each page, allowing users to search within the entire forum or specific sections of the forum.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Providing_feedback_on_protocols\">Providing feedback on protocols<\/span><\/h3>\n<p>protocols.io offers three methods for feedback directly from users: Twitter, email to the protocols.io developers (info@protocols.io), and a feedback forum where users and developers alike can respond. These comments are then used to fuel future development. Further, protocols.io recently initiated an ambassadors program where power users (usually graduate students or postdocs) who are directly connected to diverse communities provide feedback from a user perspective. Thus, future development is guided by community input from these sources.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Use_case:_VERVE_Net:_Virus_Ecology_Research_and_Virtual_Exchange_Network\">Use case: VERVE Net: Virus Ecology Research and Virtual Exchange Network<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Molecular_and_bioinformatics_protocols\">Molecular and bioinformatics protocols<\/span><\/h3>\n<p>Often, detailed \u201ctricks of the trade\u201d associated with lab, field, and bioinformatics protocols are not well described in publications, and at best they are stashed in supplemental materials. Practical information associated with running these protocols under varied conditions cannot be curated, documented, or discussed among students, postdocs, technicians, and faculty working in virology. Moreover, knowledge on when to use a particular version of a given protocol is not easily captured. 
Protocols.io provides a flexible mechanism wherein protocols can be documented in a step-wise fashion to easily pivot between molecular and bioinformatics methodologies, link to useful websites or code in Github<sup id=\"rdp-ebb-cite_ref-DabbishSocial12_14-0\" class=\"reference\"><a href=\"#cite_note-DabbishSocial12-14\" rel=\"external_link\">[14]<\/a><\/sup>, as well as reference manuals or original source materials for protocols, as exemplified in the VERVENet forum.\n<\/p><p>The user entering the protocol may not necessarily be the author of the original method. However, by providing links to the primary work, users can attribute credit to the original author while at the same time adding their own updates to the method either while they enter it, or at a later time. Further, other users have the capability to add notes and warnings to existing protocols in protocols.io. This functionality includes a mechanism to email the protocol author for protocol troubleshooting. Corrections and updates made by the protocol authors and users automatically trigger notifications emailed to researchers who use that protocol. Lastly, users can \"fork\" or copy existing protocols for further refinement or alternate uses while still maintaining links back to the original for credit and reference. As such, the protocol is a living document for the community to reuse and continually refine.\n<\/p><p>For publication, authors have the option to enter detailed methods into protocols.io, issue a digital object identifier (DOI<sup id=\"rdp-ebb-cite_ref-BatesDigital09_15-0\" class=\"reference\"><a href=\"#cite_note-BatesDigital09-15\" rel=\"external_link\">[15]<\/a><\/sup>), and link to the protocols.io record from the methods section. 
This practice is now being encouraged in journal submissions and by funding agencies.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Protocol_collections\">Protocol collections<\/span><\/h3>\n<p>Because protocols are often used in conjunction with other protocols, protocols.io has the capability to link protocols into user-defined workflows. This is particularly important for publications that may use a collection of varied protocols (field, lab, and bioinformatics) that are derived from many sources (protocols from the user or other users). In providing a collection of protocols associated with a publication, the authors make their work replicable, easy to follow, and transparent to other members of the community in a way that can be referenced and cited. For example, a collection of protocols derived from a recent publication on the human skin double-stranded DNA virome is available in VERVENet.<sup id=\"rdp-ebb-cite_ref-HanniganTheHuman15_16-0\" class=\"reference\"><a href=\"#cite_note-HanniganTheHuman15-16\" rel=\"external_link\">[16]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-HanniganTheHuman15_proto_17-0\" class=\"reference\"><a href=\"#cite_note-HanniganTheHuman15_proto-17\" rel=\"external_link\">[17]<\/a><\/sup> Thus, collections provide a mechanism for furthering open-science efforts.\n<\/p><p>Protocol collections also provide a mechanism to learn by example for early career scientists or those branching into a new area of scientific inquiry. 
In particular, detailed protocols associated with a toolkit or workshop can be included, along with multimedia options such as slides, video, or links to virtual machines with example datasets and code.<sup id=\"rdp-ebb-cite_ref-HurwitzQIIME15_18-0\" class=\"reference\"><a href=\"#cite_note-HurwitzQIIME15-18\" rel=\"external_link\">[18]<\/a><\/sup><sup id=\"rdp-ebb-cite_ref-CaporasoQIIME10_19-0\" class=\"reference\"><a href=\"#cite_note-CaporasoQIIME10-19\" rel=\"external_link\">[19]<\/a><\/sup> This is particularly important for bioinformatics protocols that often include multiple programs and steps in an analysis for a given publication. Further, individual tools may have a collection of protocols that describe specific use-cases, example datasets, and varied options that they may wish to convey to their users.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Groups_and_sharing\">Groups and sharing<\/span><\/h3>\n<p>Individual members can form groups, where the owner has the ability to choose the level of accessibility for fellow members. The groups can share literature recommendations, discussions, protocols, news, events, and job opportunities (Figure 2). Subgroups can form under the umbrella of a larger group with a common interest. This subgroup\/supergroup relationship allows smaller group activities to be shared with a larger virtual community with common interests. 
In the case of VERVENet, this supergroup links the broader research in virology with the subgroups of individual labs and more specific research interests such as plant viruses.\n<\/p><p><br \/>\n<a href=\"https:\/\/www.limswiki.org\/index.php\/File:Fig2_Kindler_F1000Res2017_5.gif\" class=\"image wiki-link\" target=\"_blank\" data-key=\"8bc381fbb64e54c590cfc059d0f8ab38\"><img alt=\"Fig2 Kindler F1000Res2017 5.gif\" src=\"https:\/\/www.limswiki.org\/images\/9\/9e\/Fig2_Kindler_F1000Res2017_5.gif\" style=\"width: 100%;max-width: 400px;height: auto;\" \/><\/a>\n<\/p>\n<div style=\"clear:both;\"><\/div>\n<table style=\"\">\n<tr>\n<td style=\"vertical-align:top;\">\n<table border=\"0\" cellpadding=\"5\" cellspacing=\"0\" style=\"\">\n\n<tr>\n<td style=\"background-color:white; padding-left:10px; padding-right:10px;\"> <blockquote><b>Figure 2.<\/b> The VERVENet group in protocols.io. Groups in protocols.io display information about the group objectives, members, subgroups, the group library and literature recommendations, group discussions, news, jobs, and events. Groups have the capacity to control access, from making groups and content public and allowing anyone to join, to restricted content and invitation-only membership. VERVENet is an example of a public forum for virology.<\/blockquote>\n<\/td><\/tr>\n<\/table>\n<\/td><\/tr><\/table>\n<h3><span class=\"mw-headline\" id=\"Literature_recommendations\">Literature recommendations<\/span><\/h3>\n<p>Each of the groups includes a literature recommendation system. This algorithm provides personalized publication recommendations based on a library from a user or group. This algorithm is used to develop libraries for viral ecology user groups that continually recommend new publications based on the growing reading lists of individual users who are part of the group. This functionality allows virologists to make their reading lists public, thereby helping new scientists joining the field in their topic area. 
The libraries from subgroups also fuel the shared public reading list within the VERVENet group, therefore creating enhanced fluidity and cross-posted content between the groups.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Live_online_discussion_forum\">Live online discussion forum<\/span><\/h3>\n<p>Each of the groups in protocols.io contains a live online discussion. Discussions can be generated directly on the discussion tab or cross-posted from discussions on specific protocols, news, or literature. Each of the discussions can reference outside websites, manuals, or online resources. This discussion forum enables users to discuss tips and tricks for specific protocols, review reagents linked to particular protocols, and reference outside resources that were not included in the original protocol.\n<\/p><p>Protocols.io also includes \u201cjournal-club\u201d capabilities to enable online discussions of published research by researchers and authors. Other unique features in protocols.io include a career advice forum with a panel of mentors<sup id=\"rdp-ebb-cite_ref-protoCareer_20-0\" class=\"reference\"><a href=\"#cite_note-protoCareer-20\" rel=\"external_link\">[20]<\/a><\/sup> and a \u201cbehind the article\u201d essay forum.<sup id=\"rdp-ebb-cite_ref-PCEssay_21-0\" class=\"reference\"><a href=\"#cite_note-PCEssay-21\" rel=\"external_link\">[21]<\/a><\/sup> These communication forums allow researchers to share their stories about how papers, protocols, or research efforts came about, which is both interesting to the community and informative for early career scientists.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Platform_infrastructure_and_interoperability\">Platform infrastructure and interoperability<\/span><\/h3>\n<p>Computers, tablets, and smart phones are becoming fundamental tools for scientists today. 
Furthermore, social networking and shared cyberinfrastructures are offering powerful new mechanisms to connect communities and science from across the world. Protocols.io leverages these powerful new tools and software capabilities to provide an online forum for viral ecology research to connect and share knowledge and resources. All components of protocols.io and the VERVENet forum are mobile-friendly and interoperable for use on diverse devices in the lab, on the desktop, or on the go.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Content_and_adoption\">Content and adoption<\/span><\/h3>\n<p>The VERVENet group currently contains 365 live protocols, 212 news articles, and 59 job opportunities. There is an event calendar that contains workshops and conferences specific to virology through the fall of 2016. We have 231 members and 22 subgroups. Examples of subgroups include the Plant Virus Ecology Network, which originally formed in 2007<sup id=\"rdp-ebb-cite_ref-MalmstromTheExpand11_22-0\" class=\"reference\"><a href=\"#cite_note-MalmstromTheExpand11-22\" rel=\"external_link\">[22]<\/a><\/sup>; the Chlorovirus Group, ECOGEO<sup id=\"rdp-ebb-cite_ref-Wood-CharlsonECOGEO15_23-0\" class=\"reference\"><a href=\"#cite_note-Wood-CharlsonECOGEO15-23\" rel=\"external_link\">[23]<\/a><\/sup>; and 18 individual labs. The International Society for Viruses of Microorganisms has listed VERVE Net on their website as a resource.<sup id=\"rdp-ebb-cite_ref-ISVMResources_24-0\" class=\"reference\"><a href=\"#cite_note-ISVMResources-24\" rel=\"external_link\">[24]<\/a><\/sup>\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Discussion_and_conclusions\">Discussion and conclusions<\/span><\/h2>\n<p>The primary goal of new group functionality in protocols.io is to provide a robust web application for sharing up-to-date protocols, literature, and community features (news, jobs, discussions). This work is exemplified in VERVE Net, a virtual community forum for virology. 
Fundamental to this goal is the ability for researchers to establish groups based on similar interests and share knowledge, without <i>a priori<\/i> knowledge of key members in a given field.\n<\/p><p>We have designed an infrastructure that has multiple entry points for establishing relationships among users, ranging from self-proclaimed groups or areas of interest to options to join groups maintained by others in an area of interest to the user, fueled by related protocols or reading lists. Moreover, news feeds about funding opportunities, job postings, or collaborative research opportunities can be fine-tuned according to interest. These connections will allow the forum to evolve naturally given rapidly developing trends and new protocols. Protocols.io is open-access and is both free-to-read and free-to-publish. The revenue and sustainability model is based on the sale of data services to reagent vendors (most popular protocols, protocol improvements, and reagent-protocol links). Protocols.io also charges fees for private non-academic groups.\n<\/p><p>Protocols.io is a central resource to connect, collaborate, share, and innovate within virtual communities. The VERVENet forum demonstrates how this new group functionality allows researchers to promote scientific inquiry, reproduction of results, and dissemination and optimization of both molecular and bioinformatics protocols, as a virtual community.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"Declarations_and_acknowledgements\">Declarations and acknowledgements<\/span><\/h2>\n<h3><span class=\"mw-headline\" id=\"Data_and_software_availability\">Data and software availability<\/span><\/h3>\n<p>Protocols.io and the VERVENet community forum are committed to open access for data content and interoperability. To that end, the content in protocols is available through an application programming interface (API) for advanced data mining, and no registration is required to view protocols, comments, or annotations. 
All public protocols are archived with CLOCKSS for long-term digital preservation.<sup id=\"rdp-ebb-cite_ref-protoDevelop_25-0\" class=\"reference\"><a href=\"#cite_note-protoDevelop-25\" rel=\"external_link\">[25]<\/a><\/sup> Users will also be able to access public protocols.io mirrored at the Center for Open Science.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Author_contributions\">Author contributions<\/span><\/h3>\n<p>LK wrote the manuscript, tested the platform, added content, and provided feedback on features and functionality. AS developed the platform, tested and designed features and functionality. LT and BLH designed VERVENet, tested the system, provided feedback on features and functionality, and wrote the manuscript. All authors read and approved the final manuscript.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Competing_interests\">Competing interests<\/span><\/h3>\n<p>Leonid Teytelman and Alexei Stoliartchouk are employees of protocols.io and both own equity in the company.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Grant_information\">Grant information<\/span><\/h3>\n<p>This work was funded by a grant to B.L.H. and L.T. 
from the Gordon and Betty Moore Foundation (GBMF4733).\n<\/p><p>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.\n<\/p>\n<h3><span class=\"mw-headline\" id=\"Acknowledgements\">Acknowledgements<\/span><\/h3>\n<p>We would like to thank Celina Gomez and James Thornton for adding \u201cseed\u201d protocols into VERVENet, and Vladimir Frolov for development of the interface and group functionality.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<div class=\"reflist references-column-width\" style=\"-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;\">\n<ol class=\"references\">\n<li id=\"cite_note-EllisonTheBen07-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-EllisonTheBen07_1-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Ellison, N.B.; Steinfield, C.; Lampe, C. (2007). \"The Benefits of Facebook \u201cFriends:\u201d Social Capital and College Students\u2019 Use of Online Social Network Sites\". <i>Journal of Computer-Mediated Communication<\/i> <b>12<\/b> (4): 1143\u20131168. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1111%2Fj.1083-6101.2007.00367.x\" target=\"_blank\">10.1111\/j.1083-6101.2007.00367.x<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+Benefits+of+Facebook+%E2%80%9CFriends%3A%E2%80%9D+Social+Capital+and+College+Students%E2%80%99+Use+of+Online+Social+Network+Sites&rft.jtitle=Journal+of+Computer-Mediated+Communication&rft.aulast=Ellison%2C+N.B.%3B+Steinfield%2C+C.%3B+Lampe%2C+C.&rft.au=Ellison%2C+N.B.%3B+Steinfield%2C+C.%3B+Lampe%2C+C.&rft.date=2007&rft.volume=12&rft.issue=4&rft.pages=1143%E2%80%931168&rft_id=info:doi\/10.1111%2Fj.1083-6101.2007.00367.x&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-KwakWhatIs10-2\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-KwakWhatIs10_2-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Kwak, H.; Lee, C.; Park, H.; Moon, S. (2010). \"What is Twitter, a social network or a news media?\". <i>Proceedings of the 19th International Conference on World Wide Web<\/i> <b>2010<\/b>: 591-600. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1145%2F1772690.1772751\" target=\"_blank\">10.1145\/1772690.1772751<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=What+is+Twitter%2C+a+social+network+or+a+news+media%3F&rft.jtitle=Proceedings+of+the+19th+International+Conference+on+World+Wide+Web&rft.aulast=Kwak%2C+H.%3B+Lee%2C+C.%3B+Park%2C+H.%3B+Moon%2C+S.&rft.au=Kwak%2C+H.%3B+Lee%2C+C.%3B+Park%2C+H.%3B+Moon%2C+S.&rft.date=2010&rft.volume=2010&rft.pages=591-600&rft_id=info:doi\/10.1145%2F1772690.1772751&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ThelwallRes14-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ThelwallRes14_3-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Thelwall, M.; Kousha, K. (2015). \"ResearchGate: Disseminating, communicating, and measuring Scholarship?\". <i>Journal of the Association for Information Science and Technology<\/i> <b>66<\/b> (5): 876\u2013889. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1002%2Fasi.23236\" target=\"_blank\">10.1002\/asi.23236<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=ResearchGate%3A+Disseminating%2C+communicating%2C+and+measuring+Scholarship%3F&rft.jtitle=Journal+of+the+Association+for+Information+Science+and+Technology&rft.aulast=Thelwall%2C+M.%3B+Kousha%2C+K.&rft.au=Thelwall%2C+M.%3B+Kousha%2C+K.&rft.date=2015&rft.volume=66&rft.issue=5&rft.pages=876%E2%80%93889&rft_id=info:doi\/10.1002%2Fasi.23236&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-WeinbauerManual10-4\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-WeinbauerManual10_4-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Weinbauer, M.G.; Rowe, J.M.; Wilhelm, S.W., ed. (2010). <i>Manual of Aquatic Viral Ecology<\/i>. American Society of Limnology and Oceanography. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.4319%2Fmave.2010.978-0-9845591-0-7\" target=\"_blank\">10.4319\/mave.2010.978-0-9845591-0-7<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=book&rft.btitle=Manual+of+Aquatic+Viral+Ecology&rft.date=2010&rft.pub=American+Society+of+Limnology+and+Oceanography&rft_id=info:doi\/10.4319%2Fmave.2010.978-0-9845591-0-7&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BrumRising15-5\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BrumRising15_5-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Brum, J.R.; Sullivan, M.B. (2015). \"Rising to the challenge: Accelerated pace of discovery transforms marine virology\". <i>Nature Reviews Microbiology<\/i> <b>13<\/b> (3): 147-59. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fnrmicro3404\" target=\"_blank\">10.1038\/nrmicro3404<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/25639680\" target=\"_blank\">25639680<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Rising+to+the+challenge%3A+Accelerated+pace+of+discovery+transforms+marine+virology&rft.jtitle=Nature+Reviews+Microbiology&rft.aulast=Brum%2C+J.R.%3B+Sullivan%2C+M.B.&rft.au=Brum%2C+J.R.%3B+Sullivan%2C+M.B.&rft.date=2015&rft.volume=13&rft.issue=3&rft.pages=147-59&rft_id=info:doi\/10.1038%2Fnrmicro3404&rft_id=info:pmid\/25639680&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HurwitzThePacific13-6\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HurwitzThePacific13_6-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Hurwitz, B.L.; Sullivan, M.B. (2013). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3585363\" target=\"_blank\">\"The Pacific Ocean virome (POV): A marine viral metagenomic dataset and associated protein clusters for quantitative viral ecology\"<\/a>. <i>PLoS One<\/i> <b>8<\/b> (2): e57355. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1371%2Fjournal.pone.0057355\" target=\"_blank\">10.1371\/journal.pone.0057355<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3585363\/\" target=\"_blank\">PMC3585363<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/23468974\" target=\"_blank\">23468974<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3585363\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3585363<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+Pacific+Ocean+virome+%28POV%29%3A+A+marine+viral+metagenomic+dataset+and+associated+protein+clusters+for+quantitative+viral+ecology&rft.jtitle=PLoS+One&rft.aulast=Hurwitz%2C+B.L.%3B+Sullivan%2C+M.B.&rft.au=Hurwitz%2C+B.L.%3B+Sullivan%2C+M.B.&rft.date=2013&rft.volume=8&rft.issue=2&rft.pages=e57355&rft_id=info:doi\/10.1371%2Fjournal.pone.0057355&rft_id=info:pmc\/PMC3585363&rft_id=info:pmid\/23468974&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3585363&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AVFacebook-7\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AVFacebook_7-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" 
href=\"https:\/\/www.facebook.com\/AquaticViruses\/posts\/760704383968498\" target=\"_blank\">\"Aquatic Viruses\"<\/a>. <i>Facebook<\/i>. 02 July 2014<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.facebook.com\/AquaticViruses\/posts\/760704383968498\" target=\"_blank\">https:\/\/www.facebook.com\/AquaticViruses\/posts\/760704383968498<\/a><\/span><span class=\"reference-accessdate\">. Retrieved 09 August 2016<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Aquatic+Viruses&rft.atitle=Facebook&rft.date=02+July+2014&rft_id=https%3A%2F%2Fwww.facebook.com%2FAquaticViruses%2Fposts%2F760704383968498&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-TeytelmanProto15-8\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-TeytelmanProto15_8-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Teytelman, L.; Stoliartchouk, A. (2015). \"Protocols.io: Reducing the knowledge that perishes because we do not publish it\". <i>Information Services & Use<\/i> <b>35<\/b> (1\u20132): 109\u2013115. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.3233%2FISU-150769\" target=\"_blank\">10.3233\/ISU-150769<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Protocols.io%3A+Reducing+the+knowledge+that+perishes+because+we+do+not+publish+it&rft.jtitle=Information+Services+%26+Use&rft.aulast=Teytelman%2C+L.%3B+Stoliartchouk%2C+A.&rft.au=Teytelman%2C+L.%3B+Stoliartchouk%2C+A.&rft.date=2015&rft.volume=35&rft.issue=1%E2%80%932&rft.pages=109%E2%80%93115&rft_id=info:doi\/10.3233%2FISU-150769&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-AppleProtocols-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-AppleProtocols_9-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">ZappyLab. <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/itunes.apple.com\/us\/app\/protocols.io\/id976303827\" target=\"_blank\">\"protocols.io\"<\/a>. <i>App Store<\/i><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/itunes.apple.com\/us\/app\/protocols.io\/id976303827\" target=\"_blank\">https:\/\/itunes.apple.com\/us\/app\/protocols.io\/id976303827<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 21 March 2016<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=protocols.io&rft.atitle=App+Store&rft.aulast=ZappyLab&rft.au=ZappyLab&rft_id=https%3A%2F%2Fitunes.apple.com%2Fus%2Fapp%2Fprotocols.io%2Fid976303827&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-GoogleProtocols-10\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-GoogleProtocols_10-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">ZappyLab. <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/play.google.com\/store\/apps\/details?id=com.zappylab.protocols\" target=\"_blank\">\"protocols.io\"<\/a>. <i>Google Play<\/i><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/play.google.com\/store\/apps\/details?id=com.zappylab.protocols\" target=\"_blank\">https:\/\/play.google.com\/store\/apps\/details?id=com.zappylab.protocols<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 21 March 2016<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=protocols.io&rft.atitle=Google+Play&rft.aulast=ZappyLab&rft.au=ZappyLab&rft_id=https%3A%2F%2Fplay.google.com%2Fstore%2Fapps%2Fdetails%3Fid%3Dcom.zappylab.protocols&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-VERVEatPtotocols-11\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-VERVEatPtotocols_11-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.protocols.io\/g\/verve-net\" target=\"_blank\">\"VERVE Net\"<\/a>. <i>protocols.io<\/i><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.protocols.io\/g\/verve-net\" target=\"_blank\">https:\/\/www.protocols.io\/g\/verve-net<\/a><\/span><span class=\"reference-accessdate\">. Retrieved 21 March 2016<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=VERVE+Net&rft.atitle=protocols.io&rft_id=https%3A%2F%2Fwww.protocols.io%2Fg%2Fverve-net&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HaakORCID12-12\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HaakORCID12_12-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Haak, L.L.; Fenner, M.; Paglione, L. et al. (2012). \"ORCID: A system to uniquely identify researchers\". <i>Learned Publishing<\/i> <b>25<\/b> (4): 259\u2013264. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1087%2F20120404\" target=\"_blank\">10.1087\/20120404<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=ORCID%3A+A+system+to+uniquely+identify+researchers&rft.jtitle=Learned+Publishing&rft.aulast=Haak%2C+L.L.%3B+Fenner%2C+M.%3B+Paglione%2C+L.+et+al.&rft.au=Haak%2C+L.L.%3B+Fenner%2C+M.%3B+Paglione%2C+L.+et+al.&rft.date=2012&rft.volume=25&rft.issue=4&rft.pages=259%E2%80%93264&rft_id=info:doi\/10.1087%2F20120404&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-protoHelp-13\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-protoHelp_13-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.protocols.io\/help\/explore\" target=\"_blank\">\"Explore protocols.io\"<\/a>. ZappyLab, Inc<span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.protocols.io\/help\/explore\" target=\"_blank\">https:\/\/www.protocols.io\/help\/explore<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 21 March 2016<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Explore+protocols.io&rft.atitle=&rft.pub=ZappyLab%2C+Inc&rft_id=https%3A%2F%2Fwww.protocols.io%2Fhelp%2Fexplore&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-DabbishSocial12-14\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-DabbishSocial12_14-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Dabbish, L.; Stuart, C.; Tsay, J.; Herbsleb, J. (2012). \"Social coding in GitHub: Transparency and collaboration in an open software repository\". <i>Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work<\/i> <b>2012<\/b>: 1277-1286. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1145%2F2145204.2145396\" target=\"_blank\">10.1145\/2145204.2145396<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Social+coding+in+GitHub%3A+Transparency+and+collaboration+in+an+open+software+repository&rft.jtitle=Proceedings+of+the+ACM+2012+Conference+on+Computer+Supported+Cooperative+Work&rft.aulast=Dabbish%2C+L.%3B+Stuart%2C+C.%3B+Tsay%2C+J.%3B+Herbsleb%2C+J.&rft.au=Dabbish%2C+L.%3B+Stuart%2C+C.%3B+Tsay%2C+J.%3B+Herbsleb%2C+J.&rft.date=2012&rft.volume=2012&rft.pages=1277-1286&rft_id=info:doi\/10.1145%2F2145204.2145396&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> 
<\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-BatesDigital09-15\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-BatesDigital09_15-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation book\">Paskin, N. (2009). \"Digital Object Identifier (DOI\u00ae) System\". In Bates, M.J.; Maack, M.N.. <i>Encyclopedia of Library and Information Sciences<\/i> (3rd ed.). CRC Press. pp. 1586\u201392. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/International_Standard_Book_Number\" target=\"_blank\">ISBN<\/a> 9780849397110.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Digital+Object+Identifier+%28DOI%C2%AE%29+System&rft.atitle=Encyclopedia+of+Library+and+Information+Sciences&rft.aulast=Paskin%2C+N.&rft.au=Paskin%2C+N.&rft.date=2009&rft.pages=pp.%26nbsp%3B1586%E2%80%9392&rft.edition=3rd&rft.pub=CRC+Press&rft.isbn=9780849397110&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HanniganTheHuman15-16\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HanniganTheHuman15_16-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Hannigan, G.D.; Meisel, J.S.; Tyldsley, A.S. et al. (2015). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4620475\" target=\"_blank\">\"The human skin double-stranded DNA virome: Topographical and temporal diversity, genetic enrichment, and dynamic associations with the host microbiome\"<\/a>. <i>mBio<\/i> <b>6<\/b> (5): e01578-15. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1128%2FmBio.01578-15\" target=\"_blank\">10.1128\/mBio.01578-15<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4620475\/\" target=\"_blank\">PMC4620475<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26489866\" target=\"_blank\">26489866<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4620475\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC4620475<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+human+skin+double-stranded+DNA+virome%3A+Topographical+and+temporal+diversity%2C+genetic+enrichment%2C+and+dynamic+associations+with+the+host+microbiome&rft.jtitle=mBio&rft.aulast=Hannigan%2C+G.D.%3B+Meisel%2C+J.S.%3B+Tyldsley%2C+A.S.+et+al.&rft.au=Hannigan%2C+G.D.%3B+Meisel%2C+J.S.%3B+Tyldsley%2C+A.S.+et+al.&rft.date=2015&rft.volume=6&rft.issue=5&rft.pages=e01578-15&rft_id=info:doi\/10.1128%2FmBio.01578-15&rft_id=info:pmc\/PMC4620475&rft_id=info:pmid\/26489866&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC4620475&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span 
style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HanniganTheHuman15_proto-17\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HanniganTheHuman15_proto_17-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Hannigan, G.D.; Meisel, J.S.; Tyldsley, A.S. et al. (10 March 2016). <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.protocols.io\/view\/The-Human-Skin-dsDNA-Virome-Topographical-and-Temp-ekubcww\" target=\"_blank\">\"The human skin double-stranded DNA virome: Topographical and temporal diversity, genetic enrichment, and dynamic associations with the host microbiome\"<\/a>. <i>protocols.io<\/i>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.17504%2Fprotocols.io.ekubcww\" target=\"_blank\">10.17504\/protocols.io.ekubcww<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.protocols.io\/view\/The-Human-Skin-dsDNA-Virome-Topographical-and-Temp-ekubcww\" target=\"_blank\">https:\/\/www.protocols.io\/view\/The-Human-Skin-dsDNA-Virome-Topographical-and-Temp-ekubcww<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=The+human+skin+double-stranded+DNA+virome%3A+Topographical+and+temporal+diversity%2C+genetic+enrichment%2C+and+dynamic+associations+with+the+host+microbiome&rft.atitle=protocols.io&rft.aulast=Hannigan%2C+G.D.%3B+Meisel%2C+J.S.%3B+Tyldsley%2C+A.S.+et+al.&rft.au=Hannigan%2C+G.D.%3B+Meisel%2C+J.S.%3B+Tyldsley%2C+A.S.+et+al.&rft.date=10+March+2016&rft_id=info:doi\/10.17504%2Fprotocols.io.ekubcww&rft_id=https%3A%2F%2Fwww.protocols.io%2Fview%2FThe-Human-Skin-dsDNA-Virome-Topographical-and-Temp-ekubcww&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-HurwitzQIIME15-18\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-HurwitzQIIME15_18-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Hurwitz, B. (24 November 2015). <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.protocols.io\/view\/QIIME-Moving-Pictures-of-the-human-microbiome-d5288d\" target=\"_blank\">\"QIIME: Moving Pictures of the human microbiome\"<\/a>. <i>protocols.io<\/i>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.17504%2Fprotocols.io.d5288d\" target=\"_blank\">10.17504\/protocols.io.d5288d<\/a><span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.protocols.io\/view\/QIIME-Moving-Pictures-of-the-human-microbiome-d5288d\" target=\"_blank\">https:\/\/www.protocols.io\/view\/QIIME-Moving-Pictures-of-the-human-microbiome-d5288d<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=QIIME%3A+Moving+Pictures+of+the+human+microbiome&rft.atitle=protocols.io&rft.aulast=Hurwitz%2C+B.&rft.au=Hurwitz%2C+B.&rft.date=24+November+2015&rft_id=info:doi\/10.17504%2Fprotocols.io.d5288d&rft_id=https%3A%2F%2Fwww.protocols.io%2Fview%2FQIIME-Moving-Pictures-of-the-human-microbiome-d5288d&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-CaporasoQIIME10-19\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-CaporasoQIIME10_19-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Caporaso, J.G.; Kuczynski, J.; Stombaugh, J. et al. (2010). <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3156573\" target=\"_blank\">\"QIIME allows analysis of high-throughput community sequencing data\"<\/a>. <i>Nature Methods<\/i> <b>7<\/b> (5): 335-6. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1038%2Fnmeth.f.303\" target=\"_blank\">10.1038\/nmeth.f.303<\/a>. 
<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Central\" target=\"_blank\">PMC<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3156573\/\" target=\"_blank\">PMC3156573<\/a>. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/PubMed_Identifier\" target=\"_blank\">PMID<\/a> <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/20383131\" target=\"_blank\">20383131<\/a><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3156573\" target=\"_blank\">http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?tool=pmcentrez&artid=PMC3156573<\/a><\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=QIIME+allows+analysis+of+high-throughput+community+sequencing+data&rft.jtitle=Nature+Methods&rft.aulast=Caporaso%2C+J.G.%3B+Kuczynski%2C+J.%3B+Stombaugh%2C+J.+et+al.&rft.au=Caporaso%2C+J.G.%3B+Kuczynski%2C+J.%3B+Stombaugh%2C+J.+et+al.&rft.date=2010&rft.volume=7&rft.issue=5&rft.pages=335-6&rft_id=info:doi\/10.1038%2Fnmeth.f.303&rft_id=info:pmc\/PMC3156573&rft_id=info:pmid\/20383131&rft_id=http%3A%2F%2Fwww.pubmedcentral.nih.gov%2Farticlerender.fcgi%3Ftool%3Dpmcentrez%26artid%3DPMC3156573&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-protoCareer-20\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-protoCareer_20-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.protocols.io\/g\/academic-career-forum\" 
target=\"_blank\">\"Academic Career Forum\"<\/a>. <i>protocols.io<\/i><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.protocols.io\/g\/academic-career-forum\" target=\"_blank\">https:\/\/www.protocols.io\/g\/academic-career-forum<\/a><\/span><span class=\"reference-accessdate\">. Retrieved 21 March 2016<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Academic+Career+Forum&rft.atitle=protocols.io&rft_id=https%3A%2F%2Fwww.protocols.io%2Fg%2Facademic-career-forum&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-PCEssay-21\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-PCEssay_21-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.pubchase.com\/essays?mostviewed\" target=\"_blank\">\"Essays\"<\/a>. <i>PubChase<\/i><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.pubchase.com\/essays?mostviewed\" target=\"_blank\">https:\/\/www.pubchase.com\/essays?mostviewed<\/a><\/span><span class=\"reference-accessdate\">. 
Retrieved 21 March 2016<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Essays&rft.atitle=PubChase&rft_id=https%3A%2F%2Fwww.pubchase.com%2Fessays%3Fmostviewed&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-MalmstromTheExpand11-22\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-MalmstromTheExpand11_22-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Malmstrom, C.M.; Melcher, U.; Bosque-P\u00e9rez, N.A. (2011). \"The expanding field of plant virus ecology: historical foundations, knowledge gaps, and research directions\". <i>Virus Research<\/i> <b>159<\/b> (2): 84-94. <a rel=\"external_link\" class=\"external text\" href=\"http:\/\/en.wikipedia.org\/wiki\/Digital_object_identifier\" target=\"_blank\">doi<\/a>:<a rel=\"external_link\" class=\"external text\" href=\"http:\/\/dx.doi.org\/10.1016%2Fj.virusres.2011.05.010\" target=\"_blank\">10.1016\/j.virusres.2011.05.010<\/a>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=The+expanding+field+of+plant+virus+ecology%3A+historical+foundations%2C+knowledge+gaps%2C+and+research+directions&rft.jtitle=Virus+Research&rft.aulast=Malmstrom%2C+C.M.%3B+Melcher%2C+U.%3B+Bosque-P%C3%A9rez%2C+N.A.&rft.au=Malmstrom%2C+C.M.%3B+Melcher%2C+U.%3B+Bosque-P%C3%A9rez%2C+N.A.&rft.volume=159&rft.issue=2&rft.pages=84-94&rft_id=info:doi\/10.1016%2Fj.virusres.2011.05.010&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-Wood-CharlsonECOGEO15-23\"><span 
class=\"mw-cite-backlink\"><a href=\"#cite_ref-Wood-CharlsonECOGEO15_23-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\">Wood-Charlson, E.; DeLong, E.; Workshop I participants (24 November 2015). <a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.earthcube.org\/document\/2015\/2015ecogeo-final-report\" target=\"_blank\">\"ECOGEO 2015 Workshop I Final Report\"<\/a>. <i>EarthCube<\/i><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.earthcube.org\/document\/2015\/2015ecogeo-final-report\" target=\"_blank\">https:\/\/www.earthcube.org\/document\/2015\/2015ecogeo-final-report<\/a><\/span><span class=\"reference-accessdate\">. Retrieved 21 March 2016<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=ECOGEO+2015+Workshop+I+Final+Report&rft.atitle=EarthCube&rft.aulast=Wood-Charlson%2C+E.%3B+DeLong%2C+E.%3B+Workshop+I+participants&rft.au=Wood-Charlson%2C+E.%3B+DeLong%2C+E.%3B+Workshop+I+participants&rft.date=24+November+2015&rft_id=https%3A%2F%2Fwww.earthcube.org%2Fdocument%2F2015%2F2015ecogeo-final-report&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-ISVMResources-24\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-ISVMResources_24-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"http:\/\/www.isvm.org\/resources.html\" target=\"_blank\">\"Resources\"<\/a>. International Society for Viruses of Microorganisms<span class=\"printonly\">. 
<a rel=\"external_link\" class=\"external free\" href=\"http:\/\/www.isvm.org\/resources.html\" target=\"_blank\">http:\/\/www.isvm.org\/resources.html<\/a><\/span><span class=\"reference-accessdate\">. Retrieved 21 March 2016<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=Resources&rft.atitle=&rft.pub=International+Society+for+Viruses+of+Microorganisms&rft_id=http%3A%2F%2Fwww.isvm.org%2Fresources.html&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<li id=\"cite_note-protoDevelop-25\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-protoDevelop_25-0\" rel=\"external_link\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation web\"><a rel=\"external_link\" class=\"external text\" href=\"https:\/\/www.protocols.io\/developers\" target=\"_blank\">\"protocols.io for developers\"<\/a>. <i>protocols.io<\/i><span class=\"printonly\">. <a rel=\"external_link\" class=\"external free\" href=\"https:\/\/www.protocols.io\/developers\" target=\"_blank\">https:\/\/www.protocols.io\/developers<\/a><\/span><span class=\"reference-accessdate\">. Retrieved 21 March 2016<\/span>.<\/span><span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.btitle=protocols.io+for+developers&rft.atitle=protocols.io&rft_id=https%3A%2F%2Fwww.protocols.io%2Fdevelopers&rfr_id=info:sid\/en.wikipedia.org:Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\"><span style=\"display: none;\"> <\/span><\/span><\/span>\n<\/li>\n<\/ol><\/div>\n<h2><span class=\"mw-headline\" id=\"Notes\">Notes<\/span><\/h2>\n<p>This presentation is faithful to the original, with only a few minor changes to presentation. 
In some cases important information was missing from the references, and that information was added.\n<\/p>\n<\/div><div class=\"printfooter\">Source: <a rel=\"external_link\" class=\"external\" href=\"https:\/\/www.limswiki.org\/index.php\/Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation\">https:\/\/www.limswiki.org\/index.php\/Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation<\/a><\/div>\n\t\t\t\t\t\t\t\t\t\t<!-- end content -->\n\t\t\t\t\t\t\t\t\t\t<div class=\"visualClear\"><\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t\t<!-- end of the left (by default at least) column -->\n\t\t<div 
class=\"visualClear\"><\/div>\n\t\t\t\t\t\n\t\t<\/div>\n\t\t\n\n<\/body>","b4dc927afe66d41e039d02b8df4b895f_images":["https:\/\/www.limswiki.org\/images\/3\/33\/Fig1_Kindler_F1000Res2017_5.gif","https:\/\/www.limswiki.org\/images\/9\/9e\/Fig2_Kindler_F1000Res2017_5.gif"],"b4dc927afe66d41e039d02b8df4b895f_timestamp":1544813847,"5558150b977a44d9e5f293e9ae7e49a1":{"type":"chapter","title":"1. Big data, informatics, and research","key":"5558150b977a44d9e5f293e9ae7e49a1"}},"link":"https:\/\/www.limswiki.org\/index.php\/Book:LIMSjournal_-_Spring_2018","price_currency":"","price_amount":"","book_size":"","download_url":"https:\/\/www.limsforum.com?ebb_action=book_download&book_id=78054","language":"","cta_button_content":"","toc":[{"type":"chapter","name":"1. Big data, informatics, and research","id":"5558150b977a44d9e5f293e9ae7e49a1","children":[{"type":"article","name":"Method-centered digital communities on protocols.io for fast-paced scientific innovation (Kindler et al. 2017)","id":"b4dc927afe66d41e039d02b8df4b895f","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:Method-centered_digital_communities_on_protocols.io_for_fast-paced_scientific_innovation"},{"type":"article","name":"Moving ERP systems to the cloud: Data security issues (Saa et al. 2017)","id":"f83633bd19906c97fe01cf5c6de8eb6e","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:Moving_ERP_systems_to_the_cloud:_Data_security_issues"},{"type":"article","name":"Big data management for cloud-enabled geological information services (Zhu et al. 2018)","id":"ec047b57c5e01fb4daaaffc7b376efce","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:Big_data_management_for_cloud-enabled_geological_information_services"}]},{"type":"chapter","name":"2. 
Bio- and cheminformatics","id":"6608804a5c96c03ce9770441da883658","children":[{"type":"article","name":"Developing a bioinformatics program and supporting infrastructure in a biomedical library (Hosburgh 2018)","id":"a3349d5e1cf1d4519948fbbbfffe0deb","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:Developing_a_bioinformatics_program_and_supporting_infrastructure_in_a_biomedical_library"},{"type":"article","name":"Closha: Bioinformatics workflow system for the analysis of massive sequencing data (Ko et al. 2018)","id":"684d7a3a2f6583b431b16f4884c7c07d","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:Closha:_Bioinformatics_workflow_system_for_the_analysis_of_massive_sequencing_data"},{"type":"article","name":"SistematX, an online web-based cheminformatics tool for data management of secondary metabolites (Scotti et al. 2018)","id":"478e0fc6bdbd74f64773b750f2c9edcc","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:SistematX,_an_online_web-based_cheminformatics_tool_for_data_management_of_secondary_metabolites"}]},{"type":"chapter","name":"3. Health, public health, and clinical informatics","id":"e20bd57b1986b44204a29c8419326fe3","children":[{"type":"article","name":"Developing a customized approach for strengthening tuberculosis laboratory quality management systems toward accreditation (Albert et al. 2017)","id":"15471c0a609cecac0db384f57371da08","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:Developing_a_customized_approach_for_strengthening_tuberculosis_laboratory_quality_management_systems_toward_accreditation"},{"type":"article","name":"Characterizing and managing missing structured data in electronic health records: Data analysis (Beaulieu-Jones et al. 
2018)","id":"8159b0ee46c6326792ce28d0e7506e33","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:Characterizing_and_managing_missing_structured_data_in_electronic_health_records:_Data_analysis"},{"type":"article","name":"Evidence-based design and evaluation of a whole genome sequencing clinical report for the reference microbiology laboratory (Crisan et al. 2017)","id":"428fb6eb50c74d741daa88c4061eeab2","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:Evidence-based_design_and_evaluation_of_a_whole_genome_sequencing_clinical_report_for_the_reference_microbiology_laboratory"},{"type":"article","name":"Information management for enabling systems medicine (Ganzinger and Knaup 2017)","id":"a68557faaf217ce0a165c006c2605bb8","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:Information_management_for_enabling_systems_medicine"},{"type":"article","name":"Generating big data sets from knowledge-based decision support systems to pursue value-based healthcare (Gonz\u00e1lez-Ferrer et al. 
2018)","id":"75f1e35ff0bfbfc1a106a26c2f646394","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:Generating_big_data_sets_from_knowledge-based_decision_support_systems_to_pursue_value-based_healthcare"},{"type":"article","name":"Big data and public health systems: Issues and opportunities (Rojas de la Escalera and Carnicero Gim\u00e9nez de Azc\u00e1rate 2018)","id":"b94dc07071fd3149fbecd75f93d73558","pageUrl":"https:\/\/www.limswiki.org\/index.php\/Journal:Big_data_and_public_health_systems:_Issues_and_opportunities"}]}],"settings":{"show_cover":"1","show_title":"1","show_subtitle":"0","show_full_title":"1","show_editor":"1","show_editor_pic":"1","show_publisher":"1","show_language":"1","show_size":"1","show_toc":"1","show_content_beneath_cover":"1","cta_button":"1","content_location":"1","toc_links":"disabled","log_in_msg":"<span><\/span> Please log in to read online.","cover_size":"medium"},"title_image":"https:\/\/s3.limsforum.com\/www.limsforum.com\/wp-content\/uploads\/Fig3_Scotti_Molecules2018_23-1.png"}}
LIMSjournal - Spring 2018
Volume 4, Issue 1
Editor: Shawn Douglas
Publisher: LabLynx Press
Copyright LabLynx Inc. All rights reserved.