Category Archives: Analytics

Optimizing Institutional Approaches to Core Facility Investment to Enable Research – NORDP 2015

Download “Optimizing Institutional Approaches to Core Facility Investment to Enable Research” from the National Organization of Research Development Professionals (NORDP) 2015 Conference, Bethesda, Maryland: [PDF]

Full text:

NORDP 2015

Optimizing Institutional Approaches to Core Facility Investment to Enable Research

Jeff Horon, Consultant, Elsevier Research Intelligence


In “Optimizing Institutional Approaches to Enable Research,” authors Grieb, Horon, Wong, Durkin, and Kunkel present a comprehensive set of best practices for providing leading-edge core facilities that contribute to the successful execution of research and increase competitiveness for external sponsorship. The authors conclude:

“…. This approach has created a number of standardized, transparent processes to effectively manage central infrastructure that enables enterprise-wide research, including a process for capital equipment planning, a procedure to evaluate new cores, a method for reviewing and managing the lifecycle of existing cores (invest, maintain, or sun-down), an investment in the administration and operational efficiencies of the cores, and support for the development and implementation of new methodologies for our investigators. The execution of these processes has provided faculty with forward-looking technologies to facilitate innovative research and provide a competitive edge for extramural support.”

The mechanisms for improving core facility management, and the tangible benefits of doing so, are therefore understood. What is often not understood at the outset is how to identify and diagnose sub-optimal funds flows and investment decisions. Funds flows, particularly those related to capital equipment depreciation, can have significant effects on core facility fees to investigators, indirect cost recovery, and the availability of funds for equipment replacement/upgrades and provision of new services. A better understanding of these funds flows can lead to better investment decisions: strategic allocation of funds to urgent equipment and facility needs as identified by scientific advisory input (versus haphazard or ‘hat in hand’ voluntary fundraising models), and periodic review, both to elicit new services that investigators would benefit from and to phase out services that have become inefficient or commoditized.

Understanding Funds Flows

Capital equipment ‘on core facility books’ vs…

Capital equipment costs may:

-be factored into investigator-facing costs, reducing the need for subsidization and providing automatic return of funds to repair, replace, and upgrade equipment; however, higher investigator-facing costs may also reduce perceived competitiveness and/or utilization

-fall into capped cost pools, reducing overall indirect cost recovery to the institution

…. ‘on university books’

Capital equipment costs may:

-be factored out of investigator-facing costs, increasing perceived competitiveness and/or utilization; however, funds flows need to be understood and managed such that there are funds to repair, replace, and upgrade equipment; increased subsidization may be required, and some of the benefits may accrue to users external to the institution

-fall into uncapped cost pools, increasing overall indirect cost recovery to the institution
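
As a rough illustration of the trade-off described above, the sketch below compares the break-even hourly fee when depreciation is recovered through investigator-facing rates with the fee when it is not. Every figure, the straight-line schedule, and the function names are hypothetical assumptions for illustration; they are not taken from Grieb et al. or from any institution's actual rate model.

```python
# All figures below are hypothetical, chosen only to illustrate the arithmetic.

def straight_line_depreciation(purchase_price: float, useful_life_years: int) -> float:
    """Annual depreciation, assuming a straight-line schedule and no salvage value."""
    return purchase_price / useful_life_years


def hourly_fee(annual_operating_cost: float, annual_depreciation: float,
               billable_hours: float, on_core_books: bool) -> float:
    """Investigator-facing hourly rate needed to break even.

    'On core facility books': depreciation is recovered through the fee.
    'On university books': depreciation is left out of the fee.
    """
    recovered = annual_operating_cost + (annual_depreciation if on_core_books else 0.0)
    return recovered / billable_hours


depreciation = straight_line_depreciation(500_000, 10)
fee_core = hourly_fee(150_000, depreciation, 2_000, on_core_books=True)
fee_univ = hourly_fee(150_000, depreciation, 2_000, on_core_books=False)

print(f"Fee with depreciation in the rate:     ${fee_core:.2f}/hour")
print(f"Fee with depreciation out of the rate: ${fee_univ:.2f}/hour")
print(f"Annual replacement funding to source elsewhere: ${depreciation:,.0f}")
```

Under these assumed numbers the lower fee looks more competitive, but the institution must fund the annual depreciation amount from another source if the equipment is to be replaced.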

Investment Decision Framework

(adapted from Grieb et al., [i] Fig. 1)


By understanding funds flows, institutions can enable strategic decision-making, such as the core facility investment decision framework presented in Grieb et al.

In particular, the existence of designated funds for equipment repair, replacement, upgrades, and new equipment purchases implies that there will be input from a scientific advisory board (“What sorts of new equipment and services do our investigators require?”) and/or executive leadership, determining how funds will be allocated from a strategic perspective.

This comprehensive view may lead to further improvements in business processes, e.g. phasing out services that have been commoditized.


[i] Grieb et al. “Optimizing Institutional Approaches to Enable Research.” Journal of Research Administration Fall 2014. Vol. XLV. No. 2.

Emerging Methods and Tools for Sparking New Global Creative Networks – COINs15 Tokyo

Download “Emerging Methods and Tools for Sparking New Global Creative Networks” from the Proceedings of the 5th International Conference on Collaborative Innovation Networks (COINs15), Tokyo, Japan – Paper: [PDF] [arXiv] [arXiv PDF] Presentation: [PDF] Poster: [PDF]

Full text:

Proceedings of the 5th International Conference on Collaborative Innovation Networks (COINs15)

Emerging Methods and Tools for Sparking New Global Creative Networks

Jeff Horon, Elsevier Research Intelligence, 360 Park Ave S, New York, NY, 10010, USA


Emerging methods and tools are changing the ways participants in global creative networks become aware of each other and proceed to interact.  These methods and tools are beginning to influence the collaboration opportunities available to network participants.
Some web-based resources intended to spark new collaborations in creative networks have been plagued by dependence on fragmented or out-of-date information, shallow recall (e.g. by being limited to a list of manually curated keywords), poor interconnectivity with other systems, and/or low end-user adoption.

Increased availability of information about creative network participants’ activities and outputs (such as completed sponsored research projects and published results, aggregated into global databases), coupled with advancement in information processing techniques like Natural Language Processing (NLP), enables new web-based technologies for discovering subject matter experts, facilities, and networks of current and potential collaborators.  Large-scale data resources and NLP allow modern versions of these tools to avoid the problems of having sparse/fragmented data and also provide for deep recall, sometimes within and across many disciplinary vocabularies.  These tools are known as “passive” technologies, from the perspective of the creative network participant, because the agent must undertake an action to use the information resources placed at his or her disposal.

Emerging “active” methods and tools utilize the same types of information and technologies, but actively intervene in the formation of the creative network by suggesting connections and arranging virtual or physical interactions.  Active approaches can achieve very high end-user adoption rates.

Both active and passive methods strive to use data-driven approaches to form better-than-chance awareness among networks of potential collaborators.  Modern instances of both types of systems generally support interconnectivity with other systems, and therefore expand the size of participants’ networks, resulting in a larger pool of potential collaborators to draw upon, both within the system and wherever the data is repurposed (e.g. into federated searches and customized applications).

Examples and Case Studies

“Passive” Networking Applications

The most widely deployed applications (providers) are: the Pure Experts portal (Elsevier), VIVO (DuraSpace), and Harvard Profiles (Harvard Medical School).  Each of these applications facilitates search and discovery of subject matter experts and their research activities and outputs.  These systems are generally organized and supported at the university level.  These applications are also federated into multi-institutional search frameworks including Direct2Experts and CTSAsearch – both of which are open to all three of the networking applications above, as well as other less widely deployed applications.

“Active” Networking Applications

Efforts toward active networking interventions are sometimes made with ‘researcher speed dating’ activities, but these generally rely on an audience with some mutual interests being gathered together (e.g. at a conference or symposium) and pairings are typically random.  Despite the existence of predictive factors for propensity to collaborate and likelihood of achieving team goals (e.g. obtaining external funding for research projects)[i], data-driven active networking methods are comparatively rarely used.  Prior case studies in active networking include:

Team design for large center and team science proposals

The University of Michigan Medical School assisted a principal investigator applicant for a large center grant with team formation, based on identifying potential participants publishing or having sponsored projects in subject matter related to the center.  This allowed for discovery of related expertise by analyzing term co-occurrence, and then discovery of the subject matter experts working with those concepts.  Multiple rounds of iteration resulted in a list of keywords, stemmed to related key terms, such that the list was both inclusive of the desired family of concepts and exclusive of ‘false positive’ matches.
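
A minimal sketch of this kind of iteration follows, assuming simple prefix matching as a stand-in for stemming. The records, stems, and stopword list are all hypothetical, and this is not the tooling actually used at the University of Michigan. It shows an inclusive stem list, an exclusion list for ‘false positive’ matches, and a count of co-occurring terms that could inform the next round.

```python
import re
from collections import Counter

# Hypothetical records: (researcher, title or abstract text).
records = [
    ("Researcher A", "Obesity and nutritional interventions in adolescents"),
    ("Researcher B", "Nutrient sensing and metabolic signaling in obese mice"),
    ("Researcher C", "Metabolic profiling of tumor cells"),
    ("Researcher D", "Soil nutrient cycling in forests"),
]

include_stems = ["obes", "nutri"]   # stems covering the desired family of concepts
exclude_stems = ["soil"]            # stems that flag known false positives
stopwords = {"and", "in", "of"}


def tokens(text):
    """Lowercase alphabetic words in the text."""
    return re.findall(r"[a-z]+", text.lower())


def matches(text, stems):
    """True if any word in the text starts with any of the stems."""
    return any(word.startswith(stem) for word in tokens(text) for stem in stems)


hits = [(who, text) for who, text in records
        if matches(text, include_stems) and not matches(text, exclude_stems)]
experts = sorted({who for who, _ in hits})

# Other words appearing in matched records: candidates to review when
# refining the keyword list in the next iteration.
cooccurring = Counter(
    word
    for _, text in hits
    for word in set(tokens(text))
    if word not in stopwords and not matches(word, include_stems)
)

print(experts)
print(sorted(cooccurring))
```

Here "Researcher D" matches the stem "nutri" but is dropped by the exclusion list, while "metabolic" surfaces as a co-occurring term that might lead to "Researcher C" in a later round.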

Suggested casual interactions at a physical event

At an institute launch event, the University of Michigan used search methods similar to those above to objectively detect researchers working in related topic areas, supplementing the institute founders’ own knowledge with information about previously unknown researchers in those areas.  Objective detection made the launch conference invitee list more inclusive and comprehensive.

Launch event organizers solicited survey responses from participants concerning areas of methodological expertise, methodological needs for upcoming projects, and areas of interest within several pre-identified areas related to the institute.

Attendees were matched based upon expressing strong mutual interest in a topic and/or by study method, in situations where one researcher expressed a need for expertise in a method and another researcher expressed the ability to share methodological expertise in the same method.  Reciprocal methodological need/provision matches were considered especially strong matches (Figure 1):


Figure 1:   A generalized example of an especially strong match

Existing collaboration data covering co-authored publications and co-participation on sponsored projects were used to rule out matches who had collaborated in the past.
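
The matching logic described above can be sketched roughly as follows. The survey fields, the scoring weights, and the names are hypothetical assumptions for illustration, not the scheme actually used at the event. The sketch scores shared interests and need/offer overlaps, adds a bonus for reciprocal matches, and skips pairs with a prior collaboration.

```python
from itertools import combinations

# Hypothetical survey responses.
survey = {
    "Ana":   {"interests": {"sleep"},     "needs": {"imaging"},    "offers": {"statistics"}},
    "Ben":   {"interests": {"sleep"},     "needs": {"statistics"}, "offers": {"imaging"}},
    "Chloe": {"interests": {"nutrition"}, "needs": {"statistics"}, "offers": set()},
    "Dev":   {"interests": {"nutrition"}, "needs": set(),          "offers": {"genomics"}},
}

# Hypothetical pairs with co-authored publications or shared sponsored projects.
prior_collaborators = {frozenset({"Chloe", "Dev"})}


def score(a, b):
    """1 per shared interest, 1 per direction of need/offer overlap,
    plus an assumed bonus of 2 when the need/offer match is reciprocal."""
    shared = len(a["interests"] & b["interests"])
    a_helps_b = bool(a["offers"] & b["needs"])
    b_helps_a = bool(b["offers"] & a["needs"])
    reciprocal_bonus = 2 if (a_helps_b and b_helps_a) else 0
    return shared + a_helps_b + b_helps_a + reciprocal_bonus


matches = []
for x, y in combinations(sorted(survey), 2):
    if frozenset({x, y}) in prior_collaborators:
        continue  # rule out people who have already collaborated
    s = score(survey[x], survey[y])
    if s > 0:
        matches.append((s, x, y))

# Strongest matches first; ties broken alphabetically for a stable order.
matches.sort(key=lambda m: (-m[0], m[1], m[2]))
for s, x, y in matches:
    print(f"{x} <-> {y}: score {s}")
```

In this toy data, Ana and Ben form a reciprocal need/provision match and rank first, while Chloe and Dev share an interest but are excluded as prior collaborators.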

To maximize the chances strong matches would interact, the seating chart was also arranged to place strong matches at the same tables.  This event also included conversation-provoking material, including a visualization of attendees arranged in a social networking diagram according to indicated areas of strong interest.

The matching process proved to be very flexible and was used to support a novel approach to bridging mentorship gaps in pediatric research[ii].

Scheduled interactions at a physical event

The University of Texas System M.D. Anderson Cancer Center has in recent years built scheduled networking activities into a key global cancer conference.  The survey mechanism and the recommendations are similar to the University of Michigan example above, but there is also support for arranging meetings, generally a mix of online meeting coordination, dedicated meeting time, and dedicated meeting spaces.  Because the conference location rotates globally and attendees vary from year to year, priority is given to matches from different institutions, as the conference may be the only time they are physically co-located.
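
One simple way to express the cross-institution priority is as a sort key, sketched below. The match tuples, institution labels, and scores are hypothetical, and the actual scheduling rules used at the conference are not described in detail here.

```python
# Hypothetical recommended matches: (person_a, inst_a, person_b, inst_b, score).
recommended = [
    ("Ana", "Inst X", "Ben", "Inst X", 5),
    ("Ana", "Inst X", "Chloe", "Inst Y", 3),
    ("Dev", "Inst Z", "Ben", "Inst X", 4),
]


def priority(match):
    """Sort key: cross-institution matches first, then higher scores first."""
    _, inst_a, _, inst_b, score = match
    same_institution = inst_a == inst_b
    return (same_institution, -score)


schedule_order = sorted(recommended, key=priority)
for person_a, _, person_b, _, score in schedule_order:
    print(f"{person_a} <-> {person_b} (score {score})")
```

With this key, a same-institution pair is scheduled last even when its score is highest, on the assumption that those colleagues can meet at home.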

In addition to the meetings booked during a specific speed dating event window in the conference program, the project team also noted a number of off-hours and informal meetings taking place, driven in part by the recommended matches.


These emerging methods and tools suggest the existence of repeatable strategies for facilitating data-driven matching and better-than-chance interactions designed to spark new global creative networks.  As these methods become further systematized and see wider adoption, they are poised to influence larger numbers of creative networks and their participants.

[i] Lungeanu, A., Huang, Y., and Contractor, N.S. (2014) “Understanding the assembly of interdisciplinary teams and its impact on performance.” Journal of Informetrics.  8(1):59-70.

[ii] Nigrovic, P.A., Muscal, E., Riebschleger, M., et al. (2014) “AMIGO: A Novel Approach to the Mentorship Gap in Pediatric Rheumatology.” Journal of Pediatrics 164(2):226-7.e1-3.

Optimizing Institutional Approaches to Enable Research – Journal of Research Administration

Our article “Optimizing Institutional Approaches to Enable Research” is now available in the Journal of Research Administration Volume XLV, No. 2.

Society of Research Administrators International members can access the article at:

From the editor:

In “Optimizing Institutional Approaches to Enable Research”, Grieb and co-authors focus on a key requirement of research administrators, that of ensuring there is adequate infrastructure to create the backbone for cutting edge research. Within the constraints of a university budget, core facilities must be sustained and replaced in order to compete for extramural funding. “The historic high-end, self-sufficient laboratories have been mostly replaced by laboratories that rely on institutionally supported infrastructure (i.e. core facilities).” Decision making about what to support, the cost of the support and the replacement of the core facilities is often not well managed. An institutional approach for enhancing the effectiveness of core infrastructure operations by implementing process improvements, managing the lifecycle of core facilities, and monitoring key core facilities’ metrics is described. In doing so, it addresses one of the key concerns raised in the article by Derrick and Nickson, that strategies that engage researchers, promote communication between administrators and researchers, and lead to a collaborative approach to streamline bureaucratic processes engenders success.

Visualization for Research Management – VIVO 2014

Download “Visualization for Research Management” from the VIVO 2014 conference: PDF


Universities and funding bodies are placing increasing emphasis on return on investment (ROI) related to research. Research managers at all levels need objective metrics and data, further developed into visualizations, that provide insights to support decisions about investments and that also promote understanding of the outcomes of those decisions.

It is critical to have visibility into both inputs and outputs related to research, and VIVO-compatible data can be used for these purposes. Examples include:

-An organizational dashboard used by top-level administrators at a large research organization (>$0.5 Billion annual research expenditure)

-Benchmarking and collaboration analysis

-Faculty activity reporting

-Recruitment and retention analysis

Evidence-based Metrics for Research Performance Strategies – NORDP 2014

Download “Evidence-based Metrics for Research Performance Strategies” from the Pre-NORDP 2014 Workshop: PDF

This presentation covers:

What are metrics?
+How to develop good metrics

Metrics for research
+How to develop good research metrics

Expanding research dashboard metrics to benchmarking and collaboration

Drilling beneath research dashboard metrics for advanced use


With increased competition in the US R&D landscape, research institutions are taking a strategic approach to research and collaboration. Structured data sources, evidence-based metrics, and collaboration and benchmarking tools, as delivered by the new Pure Experts Portal, are increasingly being used by research managers to inform decision making and to enhance their institutions’ research strategy. Current users of SciVal Experts will share case studies and how they have used the web services and functionality of SciVal Experts to address critical institutional needs.

Developing Forward-Looking Metrics and Reporting – University of Michigan StaffWorks 2011

Materials from our presentation Developing Forward-Looking Metrics and Reporting are now available: PDF

Or download our poster: PDF


Break your business unit out of the cycle of wanting to be more forward-looking but never actually developing metrics or producing reporting to support that goal. Learn best practices and hear practical advice for developing forward-looking metrics across the complete life cycle of metric development, including: metric creation and validation, building awareness and acceptance among leadership, standardization and refinement, and integration into existing reporting.

Places & Spaces: Mapping Science Exhibit

The Places & Spaces: Mapping Science Exhibit visited the University of Michigan Hatcher Graduate Library from March 7 through May 24, 2011.

From my trip to the exhibit — the Network Science Research poster from University of Michigan contributors featured prominently at the entrance:


From the poster:

The works presented here were invited to accompany the international Places & Spaces: Mapping Science exhibit on display at the University of Michigan Library in March 1 to May 24, 2011.

The maps showcase excellence in network science research and education at the University of Michigan. They were created in many different areas of science, including medicine, bioinformatics, sociology, organizational studies, information science, and physics, to advance our understanding of the structure and dynamics of networks.

On a practical side, Jeff Horon’s map demonstrates the innovative usage of science analysis and mapping in support of team formation and grant proposal writing that helped secure a P30 award by National Institutes of Health that finances a new Nutrition Obesity Research Center at the University of Michigan.

It is our hope that the Mapping Science exhibit will inspire future collaborations across disciplinary and geospatial boundaries, the application of effective visualization techniques to communicate the results of these studies, and the adoption of advanced data analysis and visualization methods to improve daily decision making.

Download the poster: PDF

The exhibit features maps describing the power of maps, reference systems, and forecasts as well as science maps for economic decision makers, policy makers, and scholars. These are the ‘stars of the show’ but I won’t go into detail here since they are already well-described on the exhibit website.

The exhibit also features WorldProcessor Globes reflecting ‘Patterns of Patents & Zones of Invention’ over time and across geography:


Patterns of Patents & Zones of Invention [WorldProcessor #286]

“This globe plots the total amount of patents granted worldwide, beginning in 1883 with just under 50,000, hitting 650,000 in 1993 (near the North Pole), and (shifting the scale to the southern hemisphere) continuing to 2002 on a rapid climb towards 1 million. Geographic regions where countries offer environments conducive to fostering innovation are represented by topology. Additionally, nations where residents are granted an average of 500 or more US patents per year are called out in red by their respective averages in the years after 2000. © 2005 Ingo Gunther”


And the ‘Shape of Science’ given “quantified connectivities and relative flows of inquiry within the world of science”:


“This rendering is of a prospective tangible sculpture of the Shape of Science, based on the research of Richard Klavans and Kevin Boyack, spatializing the quantified connectivities and relative flows of inquiry within the world of science. © 2006 Ingo Gunther w/ Stephen Oh”


The exhibit also includes interactive video monitors and hands-on maps for kids.

Attributions and Acknowledgements:

Network Science Research Poster

“Several University of Michigan faculty created maps included in the exhibit: Santiago Schnell, Molecular and Integrative Physiology; Lada Adamic, School of Information; M. E. J. Newman, Physics; Jeff Horon, Medical School; Helena Buhr, Natalie Cotton, and Jason Owen-Smith, Sociology and Organizational Studies.”


Overall Exhibit

Places & Spaces is curated by Dr. Katy Börner and Michael J. Stamper, School of Library and Information Science at Indiana University. Places & Spaces also receives input from the Places and Spaces Advisory Board. The exhibit is sponsored by the National Science Foundation under Grant No. IIS-0238261, CHE-0524661, IIS-0737783 and IIS-0715303; the James S. McDonnell Foundation; Thomson Scientific/Reuters; Elsevier; the Cyberinfrastructure for Network Science Center, University Information Technology Services, and the School of Library and Information Science, all three at Indiana University. Some of the data used to generate the science maps is from Thomson Scientific/Reuters.

Thank you to Dr. Katy Börner and Michael J. Stamper for curating the exhibit. Thank you to Tim Utter and Rebecca Hill for bringing the exhibit to the University of Michigan.

This exhibit was mentioned in press at:

Heritage Newspapers Online and Print Editions

Simple Text Mining for a Known Lexicon in Excel/VBA

My Excel/VBA package for simple text mining against a known lexicon is now available:  License it for free.

From the University of Michigan Office of Technology Transfer Site:

Title: Simple Text Mining

Technology # 4730


Currently, there is a lack of text/network mining software available to the typical analyst end-user. Generally available text mining algorithms require extensive programming to implement. Typically, these more complex algorithms have an extremely steep learning curve, requiring a long-term commitment of professional software developer resources. Such solutions usually cannot be implemented by the typical analyst or small business.

Technology Description

The University of Michigan has developed an Excel-based tool and algorithm for text mining that ‘reads’ blocks of unstructured text for each word in a lexicon (supplied by the user) and assembles the words found into a common network analysis data structure called an “edge list.” This analysis includes additional descriptive data concerning the weight of lexicon words found. This ‘weight’ output allows for analysis of terms found. The network output allows for analysis of term “adjacency,” i.e. appearing together in the same block of unstructured text, the computation of network analysis measures, and the production of network visualizations. Outputs include user-specified data dimensions, carried over from the text input, for easily cross-referenced and more descriptive output.
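
For readers who want a feel for the approach outside Excel, here is a minimal Python sketch of the same general idea. It is not the licensed Excel/VBA tool or its algorithm. The lexicon, the input blocks, and the choice to treat ‘weight’ as a per-block occurrence count are assumptions made for illustration.

```python
import re
from collections import Counter
from itertools import combinations

lexicon = ["obesity", "nutrition", "insulin"]   # hypothetical user-supplied lexicon

# Hypothetical input: (identifier carried over to the output, block of text).
blocks = [
    ("doc1", "Obesity and nutrition: insulin response. Obesity trends."),
    ("doc2", "Nutrition policy and obesity."),
    ("doc3", "Unrelated text."),
]


def find_terms(text, lexicon):
    """Count whole-word, case-insensitive occurrences of each lexicon term."""
    lowered = text.lower()
    counts = Counter()
    for term in lexicon:
        n = len(re.findall(r"\b" + re.escape(term.lower()) + r"\b", lowered))
        if n:
            counts[term] = n
    return counts


term_weights = []          # rows of (block id, term, occurrences in that block)
edge_weights = Counter()   # (term_a, term_b) -> number of blocks containing both

for block_id, text in blocks:
    found = find_terms(text, lexicon)
    term_weights.extend((block_id, term, n) for term, n in sorted(found.items()))
    # Terms found in the same block are "adjacent": one edge per pair.
    for a, b in combinations(sorted(found), 2):
        edge_weights[(a, b)] += 1

edge_list = sorted((a, b, w) for (a, b), w in edge_weights.items())

for row in term_weights:
    print(row)
for edge in edge_list:
    print(edge)
```

The resulting edge list can be pasted into most network analysis packages, and the block identifier in each weight row plays the role of a data dimension carried over from the input.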

• Analysis of unstructured text for a large number of known lexical terms
• Analysis of occurrence and adjacency (co-occurrence) of terms in papers, abstracts, etc.

• Approachability / ease-of-use (single-click processing of input text)
• Easy copy/paste of input/output data

Software and Copyright/Algorithms & Signal Processing
Software and Copyright/Opensource

How the algorithm works:


Excel Chart Templates

It’s easier to communicate when your data is the most prominent feature of your chart.  Start from good templates.

Basic Excel charts draw focus to themselves instead of the data at hand, by defaulting to include dark gridlines, dark lines and tick marks on each axis, a dark border, color-coded series, and indirect labeling. However, visualization master Edward Tufte and others have taught us that less is often more. By avoiding ‘non-data ink,’ chartjunk, and formatting ‘gloss,’ we can improve the visual clarity of — and therefore the effectiveness of — our data visualizations.

Time is valuable. This means that we should use tools that are good by default. To that end, I have created a series of templates for the six basic Excel chart types.

The basic formatting choices that distinguish these charts from Excel defaults are: light gray gridlines, no axis lines, no tick marks, no borders, and no legend [if you need to describe multiple series, consider the technique of small multiples]. If color encoding becomes necessary, you’ll have to do this manually (as was done for the pie chart at the bottom).

Area Chart – Download Area Chart [.crtx] Template


Bar Chart – Download Bar Chart [.crtx] Template


Column Chart – Download Column Chart [.crtx] Template


Line Chart – Download Line Chart [.crtx] Template


Pie Chart – Download Pie Chart [.crtx] Template


(you’ll need to recolor your series manually to achieve the monochromatic blue effect)

Scatter Chart – Download Scatter Chart [.crtx] Template


To use these templates, save them to your template directory, which is probably:

C:\Documents and Settings\username\Application Data\Microsoft\Templates\Charts

Then, the next time you want to insert a chart, select ‘All Chart Types’ from the bottom of any ‘Insert’ –> ‘Chart’ menu and then ‘Templates.’ You should see any templates saved into your templates directory as options.