
MWSUG 2019 Paper Presentations

Paper presentations are the heart of a SAS users group meeting. MWSUG 2019 will feature dozens of paper presentations organized into several academic sections covering a variety of topics and experience levels.

Note: Content and schedule are subject to change. Last updated 03-Jul-2019.



Business Leadership

Paper No. Author(s) Paper Title (click for abstract)
BL-006 Kirk Paul Lafler Differentiate Yourself
BL-066 Richann Watson
& Louise Hadden
Are you Ready? Preparing and Planning to Make the Most of your Conference Experience
BL-068 Sreejita Biswas
& Miriam Mcgaugh
Taxi Ride Prediction: Does the Yellow Cab Supply Meet Customer Demands?
BL-074 Prajakta Pai
& Miriam Mcgaugh
Impact of Aging Population on Social Security and Underlying Trends
BL-076 Manasi Murde
& Miriam Mcgaugh
Why Admitted Students Opt-out of College Enrollment?
BL-101 Steven Myers Exploring and characterizing time series data in a non-regression based approach
BL-104 Steven Myers Explore your data before you rush to analysis, you will thank me later: explorations in cross section data


Hands On Workshop

Paper No. Author(s) Paper Title (click for abstract)
HW-007 Kirk Paul Lafler Powerful and "Sometimes" Hard-to-find PROC SQL Features
HW-020 Xiaoting Wu Survival Tips for Survival Analysis
HW-022 Jayanth Iyengar Understanding Administrative Healthcare Data sets using SAS programming tools
HW-037 Troy Hughes Parallel Processing Your Way to Faster Software and a Big Fat Bonus: Demonstrations in Base SAS®
HW-050 Josh Horstman Doing More with the SGPLOT Procedure
HW-063 Richann Watson
& Kriss Harris
Interactive Graphs
HW-086 Kent Phelps
& Ronda Phelps
Base SAS® & SAS® Enterprise Guide® Automate Your SAS® World with Dynamic Code ~ Forwards & Backwards
HW-094 Ben Cochran Not Even One Single Solitary Semicolon: Powerful SAS Things You Can Do Without Writing Programs


Industry

Paper No. Author(s) Paper Title (click for abstract)
IN-009 Samuel Berestizhevsky
& Tanya Kolosova
Creating of Consumer and Product Profile by using Polytomous Rasch Measurement Model and Relational Bayesian Networks
IN-014 Doug Thompson Comparison of three methods for transforming predictor variables to improve model fit using SAS
IN-015 Peter Flom Scatterplots: Basics, enhancements, problems and solutions.
IN-019 Xiaoting Wu Time-To-Event Analysis in the Presence of Competing risks
IN-021 Kirk Paul Lafler Exploring the Skills Needed by the Data Science / Analytics Professional
IN-036 Troy Hughes From FREQing Slow to FREQing Fast: Facilitating a Four-Times-Faster FREQ with Divide-and-Conquer Parallel Processing
IN-044 Andrea Mclain The SAS EG Process Flow: A Customizable Data Mining Tool in the Search for Healthcare Fraud
IN-061 Lynn (Xiaohong) Liu
& Roderick Jones
Leveraging RAREEVENTS Procedure Options to Monitor and Evaluate Infrequent Events in Healthcare
IN-064 Thu Dinh et al. Detecting Side Effects and Evaluating Effectiveness of Drugs from Customers' Online Reviews using Text Analytics and Data Mining Models
IN-080 Lin Qi et al. Do undergraduates need prerequisites for common courses?
IN-081 Benjamin Cronk Using SAS to predict the occurrence of study milestones used to initiate planned interim analysis
IN-083 Sai Gopi Krishna Govindarajula
& Miriam Mcgaugh
Classifying Risk in Life Insurance using Predictive Analytics
IN-087 Dan Dewitz Tuning Tufte: creating minimalist data visualizations with SG Plot
IN-088 Nancy Brucken Timing is Everything: Defining ADaM Period, Subperiod and Phase
IN-091 Zhixin Lun
& Ravindra Khattree
Simulating Skewed Multivariate Distributions Using SAS: The Cases of Lomax, Mardia's Pareto (Type I), Logistic, Burr and F Distributions
IN-092 Tao Shu The Optimal Flight Ticket Price Model Based on Bivariate Normal Distribution
IN-095 Michael Wise
& Soumya Rajesh
Findings About: De-mystifying the When and How
IN-113 Michael G. Wilson Sample Size and Design Considerations in Studies Assessing Non-Inferiority using Continuous Outcomes


Rapid Fire

Paper No. Author(s) Paper Title (click for abstract)
RF-003 Kirk Paul Lafler A Visual Step-by-step Approach to Converting an RTF File to an Excel File
RF-004 Kirk Paul Lafler Saving and Restoring Startup (Initialized) SAS® System Options
RF-017 Brooke Ellen Delgoffe 3 ways to get Pretty Excel-Style Tables: PROC REPORT, PROC TABULATE, and Help from EG
RF-025 Bill Qualls Using SAS to recreate Mike Bostock's creation of an E.J. Marey-inspired Rail Traffic Plot
RF-029 Michael Harper Using the XLSX libref engine with metadata available in Dictionary Tables
RF-033 Troy Hughes The Doctor Ordered a Prescription, Not a Description: Driving Dynamic Data Governance Through Prescriptive Data Dictionaries That Automate Quality Control and Exception Reporting
RF-034 Troy Hughes Abstracting and Automating Hierarchical Data Models: Leveraging the SAS® FORMAT Procedure CNTLIN Option To Build Dynamic Formats That Clean, Convert, and Categorize Data
RF-038 Robert G. Downer Evaluate your SCORE: Logistic regression prediction comparison using the SCORE statement
RF-042 Richard Spotswood Fuzzy Matching Commercial Entity Names
RF-058 Louise Hadden Like, Learn to Love SAS® Like
RF-060 Louise Hadden
& Troy Hughes
DOMinate your ODS Output with PROC TEMPLATE, ODS Cascading Style Sheets (CSS), and the ODS Document Object Model (DOM)
RF-067 Raj Laxmi Prakash
& Miriam Mcgaugh
Breaking Human Trafficking Network: An Analytics Approach
RF-069 Sai Teja Sagi
& Miriam Mcgaugh
The Advent of Renewable Energy
RF-077 Kathryn Schurr Utilizing Macros to Create Patient Site Matching via Zip-Code Radiuses
RF-079 Harish Reddy Patlolla
& Miriam Mcgaugh
US Airline Passenger Satisfaction using SAS Enterprise Miner
RF-100 Katelyn Ware
& Rachel Baxter
Surviving Survival Analysis 101: Making the Likelihood Ratio Test Easier Using a Macro
RF-105 Laurie Smith Comparing Dates without an Array
RF-108 Stephanie Thompson What Not to Do in a Program Used with %include
RF-109 Stephanie Thompson 10 Cool Things You Can Do in a DATA STEP


SAS 101 Plus

Paper No. Author(s) Paper Title (click for abstract)
SP-002 Kirk Paul Lafler SAS® Macro Programming Tips and Techniques
SP-005 Kirk Paul Lafler SAS® Performance Tuning Techniques
SP-012 Zeke Torres PROC FORMAT with HTML - for useful Drill Down output in Web and/or Excel
SP-018 George Vineyard Utilizing SAS Macros, Do Loops and ODS to produce automated production quality individual profiles
SP-026 Bruce Lund Logistic Regression, Basics and Beyond
SP-031 Troy Hughes User-Defined Multithreading with the SAS® DS2 Procedure: Performance Testing DS2 Against Functionally Equivalent DATA Steps
SP-043 Jayanth Iyengar
& Josh Horstman
Look up not down: Advanced Table Lookup Techniques in BASE SAS
SP-052 Josh Horstman Fifteen Functions to Supercharge Your SAS® Code
SP-053 Josh Horstman Using Macro Variable Lists to Create Dynamic Data-Driven Programs
SP-057 Louise Hadden Using ODS Trace (DOM), Procedural Output and ODS Output Objects to Create the Output of Your Dreams
SP-065 Richann Watson
& Louise Hadden
Quick, Call the "FUZZ": Using Fuzzy Logic
SP-072 LeRoy Bessler Powerful SAS® Output Delivery with ODS EXCEL
SP-093 Ben Cochran Urge to Merge? Maybe You Should Update Instead.
SP-110 Arthur Li Creating In-line Style Macro Functions


e-Poster

Paper No. Author(s) Paper Title (click for abstract)
PO-023 Derek Grittmann
& Adam Hendricks
Creating a True LSF Batch Job Submission Capability on SAS EG in a SAS Grid
PO-028 Jose Centeno Generating SAS Datasets from ASCII Files Using a Crosswalk
PO-030 Troy Hughes Badge in Batch with Honeybadger: Generating Conference Badges with Quick Response (QR) Codes Containing Virtual Contact Cards (vCards) for Automatic Smart Phone Contact List Upload
PO-040 Venkateswarlu Toluchuri Configuration and Usage of SASPy on Grid 9.4
PO-045 Mario Tejada Have Your SAS Program and Schedule It Too!
PO-070 Varsha Ganagalla
& Daniel Adrian
Levels Do Count - A New Dimension To The Interaction Effect In 3-way Factorial Analysis
PO-071 Hai Nguyen Frequency matching case-control techniques: an epidemiological perspective
PO-082 Ted Conway Oh, There's No Place Like SAS ODS Graphics for the Holidays!
PO-097 Abigail Zysk
& Kylie Springer
Exploring Wine Reviews: How Language and Word Use Varies in Wine Reviews
PO-098 Anne Cain-Nielsen
& Scott Regenbogen
Profiling hospital length of stay using the mode
PO-099 Laurie Smith Seeing the Things We Love with SAS




Abstracts

Business Leadership

BL-006 : Differentiate Yourself
Kirk Paul Lafler, Software Intelligence Corporation

Today's job, employment, contracting, and consulting marketplace is highly competitive. As a result, SAS® professionals should do everything they can to differentiate and prepare themselves for the global marketplace by acquiring and enhancing their technical and soft skills. Topics include assessing and enhancing existing skills using an assortment of valuable, and "free", SAS-related content; becoming involved, volunteering, publishing, and speaking at in-house, local, regional, and international SAS user group meetings and conferences; and publishing blog posts, videos, articles, and PDF "white" papers to share knowledge and stand out from the competition.


BL-066 : Are you Ready? Preparing and Planning to Make the Most of your Conference Experience
Richann Watson, DataRich Consulting
Louise Hadden, Abt Associates Inc.

Whether you are a first-time or a long-time conference attendee, this paper can help you get the most out of your conference experience. As long-time conference attendees and volunteers, we have found that there are some things people just don't think about when planning their conference attendance. In this paper we discuss helpful tips such as making the appropriate travel arrangements, what to bring, networking and meeting up with friends and colleagues, and how to prepare for your role at the conference. We also discuss maintaining a workplace presence with your paying job while at the conference.


BL-068 : Taxi Ride Prediction: Does the Yellow Cab Supply Meet Customer Demands?
Sreejita Biswas, Oklahoma State University, Stillwater
Miriam Mcgaugh, Oklahoma State University

New York is the taxi capital of America and home to the classic yellow taxicab. It would benefit taxi companies and customers alike if rides were available whenever a customer needs one. To achieve this level of service, it is important to know how different factors affect the number of rides, a process complicated by external forces such as weather. Because of pricing strategies employed by other cab companies, such as surge pricing, customers are always on the hunt for affordably priced rides at the time of need. This paper attempts to predict the demand for a yellow taxi at a particular location, on a particular day, and at a particular time, which will help estimate the number of taxis that should be present at any given time or place. The project focuses on New York yellow taxis dispatched from a central facility in 2018, and will help to understand and predict the demand for and supply of yellow taxicabs, improving the process for both customer satisfaction and the taxi industry. Six months' worth of data (Jan 2018 - June 2018) from the New York City Taxi and Limousine Commission and from Weather Underground were obtained, including pick-up/drop-off locations, time/date, distance, payment source, temperature, wind speed, and precipitation levels. Because of the variation in weather over the six months, multiple models will be built in SAS® Forecast Studio to predict demand and supply.


BL-074 : Impact of Aging Population on Social Security and Underlying Trends
Prajakta Pai, Oklahoma State University
Miriam Mcgaugh, Oklahoma State University

For years Social Security has been a major source of income for retired individuals and their families. Social Security was developed as an anti-poverty program that focuses on retired workers, disabled individuals, and survivors of workers. Over the years, factors such as improved longevity, education, and the retirement of baby boomers have strained the Social Security reserves. Research suggests that Social Security benefits will soon be reduced because there are more beneficiaries within the system. Currently, over 61 million beneficiaries are paid each month. However, with the retirement of the baby boomer population, the Social Security funds will see an increase in payouts and a decrease in income. It is imperative to examine trends in the population versus individuals who will draw Social Security upon retirement, in order to identify factors that may contribute to a deficit or surplus in Social Security funds in the coming years. This project examined trends in Social Security beneficiaries and the aging population, while also looking at trends among the non-resident population in each state. Based on these data, we will investigate whether changes in the population of non-resident workers have an effect on Social Security funds in a specific region of the country. SAS Enterprise Miner will be used to explore US Census and Social Security Administration data. The expectation is that the population of non-resident workers will contribute to increasing Social Security funds and reduce the disparity from the aging population.


BL-076 : Why Admitted Students Opt-out of College Enrollment?
Manasi Murde, Oklahoma State University
Miriam Mcgaugh, Oklahoma State University

Understanding and improving student enrollment has always been important for departments across universities. It is imperative for universities to have a firm grip on enrollment of prospects, applicants, and admitted students. Research indicates that one out of every three admitted students does not enroll in that college. One of the prime reasons universities must concentrate on increasing their enrollment rate is to avoid loss of revenue. Admitted but non-enrolled students have a negative financial impact on a university through lost tuition and money expended on marketing and recruitment. Non-enrollment also results in a loss of academic potential. There can be numerous reasons why admitted students fail to enroll at a particular university, which may be specific to one university or common among many schools. Therefore, each school must identify any existing trends within its own unique admitted non-enrolled student population. This research concentrates on analyzing data to identify factors and indicators for why students fail to enroll after accepting admission to Oklahoma State University, and on predicting non-enrollment based on those factors. The data for this analysis include email communications, text messages, demographics, and admission process timestamps. Preliminary analysis indicates that non-enrollment is higher among first-generation students and academically talented students. Results from this research will help the Marketing Department of Oklahoma State University improve undergraduate enrollment by understanding non-enrollment rates across demographics and the communications and factors that lead to non-enrollment.


BL-101 : Exploring and characterizing time series data in a non-regression based approach
Steven Myers, The University of Akron

Business leaders as well as data analysts and data scientists need to understand the particularities of time series data. This paper reports on an introduction to time series as taught to students in a first business analytics course, making use of data from FRED, the marvelous time series repository at the Federal Reserve Bank of St. Louis. Students are cautioned not to run to advanced techniques before stopping to fully explore the data; this approach is designed to instill an EDA mentality in students while teaching them how to manipulate and characterize time series data in SAS, thereby laying the groundwork for more advanced work in time-series econometrics, forecasting, and predictive analytics. Also instilled in the students is an appreciation of knowing the data generating process. SAS programming is taught through this approach, focusing on SAS functions such as DIF and LAG and PROCs CORR, MEANS, and SGPLOT. The paper concludes with basic coverage of the random walk and the spurious correlation that can easily result in economic time series data when one does not first investigate data stationarity.
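The DIF and LAG functions mentioned in the abstract can be illustrated with a minimal sketch; the data set and variable names here are hypothetical, not from the paper:

```sas
/* Illustrative sketch: computing a lag and a first difference
   for a time series. Assumes a hypothetical data set GDP_SERIES
   with variables DATE and GDP, sorted by DATE. */
data growth;
   set gdp_series;
   gdp_lag = lag(gdp);   /* value of GDP from the previous observation */
   gdp_chg = dif(gdp);   /* first difference: gdp - lag(gdp) */
run;
```

Differencing a series this way is the usual first step in checking for stationarity before any regression-based analysis.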


BL-104 : Explore your data before you rush to analysis, you will thank me later: explorations in cross section data
Steven Myers, The University of Akron

Economists, business leaders, and analysts spend a great deal of time analyzing structured cross-sectional data. This paper is an introduction to exploratory data analysis for economic and business data analytics students in an introductory economics course, teaching data handling and SAS programming and featuring PROCs MEANS, SGPLOT, FREQ, CORR, REG, and TABULATE. A dataset on rents paid is used to illustrate the solution to the problem: do women pay higher rents on a college campus? It is important to learn all you can about your data before rushing to analysis, yet students typically rush to more advanced and fancier techniques. In this paper we show how to ground the analysis in a firm understanding of the data generating process and suggest many ways to learn about the underlying data. Additional data are introduced to illustrate the problems of data cleaning and manipulation in large samples, using the effect of economic freedom on standards of living worldwide as an example; the paper concludes with the steps in the process to reveal causal patterns in that data. The experience of students is highlighted.


Hands On Workshop

HW-007 : Powerful and "Sometimes" Hard-to-find PROC SQL Features
Kirk Paul Lafler, Software Intelligence Corporation

The SQL procedure contains many powerful and elegant language features for intermediate and advanced SQL users. This hands-on workshop presents topics that will help SAS users unlock the many powerful features, options, and other gems found in the SQL universe. Topics include using CASE logic to assign new values; a sampling of summary (statistical) functions; identifying FIRST, LAST, and BETWEEN rows in BY-groups; accessing metadata from Dictionary tables; creating single-value and value-list macro variables using the PROC SQL and macro interface; performing two-table joins and discussing the four available join algorithms; and using the PROC SQL statement options _METHOD, MAGIC=101, MAGIC=102, and MAGIC=103 to better understand what the SQL optimizer does with a query.
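A few of the features named above can be sketched briefly; this is an illustrative example against the SASHELP.CLASS sample table, not material from the workshop itself:

```sas
proc sql;
   /* CASE logic to assign a new value */
   select name,
          case when age < 13 then 'Child'
               else 'Teen'
          end as age_group
     from sashelp.class;

   /* Create a single-value macro variable via the macro interface */
   select avg(height) into :avg_height trimmed
     from sashelp.class;

   /* Access metadata from a Dictionary table */
   select name, type, length
     from dictionary.columns
    where libname = 'SASHELP' and memname = 'CLASS';
quit;
%put Average height: &avg_height;
```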


HW-020 : Survival Tips for Survival Analysis
Xiaoting Wu, University of Michigan

Survival analysis is a common type of analysis in the health care field. This hands-on tour will convey some survival tips for survival analysis. We will provide an overview of survival analysis using the SAS LIFETEST and PHREG procedures, including data preparation and visualization, variable selection, model specification, model validation, and output interpretation. We will also showcase some advanced applications, such as obtaining prediction estimates and customizing and outputting plots from the SAS PHREG procedure.


HW-022 : Understanding Administrative Healthcare Data sets using SAS programming tools
Jayanth Iyengar, Data Systems Consultants LLC

Changes in the healthcare industry have highlighted the importance of healthcare data. The volume of healthcare data collected by healthcare institutions, such as providers and insurance companies, is massive and growing exponentially. SAS programmers need to understand the nuances and complexities of healthcare data structures to perform their responsibilities. There are various types and sources of administrative healthcare data, including healthcare claims (Medicare, commercial insurance, and pharmacy), hospital inpatient, and hospital outpatient data. This training seminar will give attendees an overview and detailed explanation of the different types of healthcare data, and the SAS programming constructs to work with them. The workshop will engage attendees with a series of SAS exercises using healthcare datasets.


HW-037 : Parallel Processing Your Way to Faster Software and a Big Fat Bonus: Demonstrations in Base SAS®
Troy Hughes, Datmesis Analytics

SAS® software and especially extract-transform-load (ETL) systems commonly include components that must be serialized due to real process dependencies. For example, a transform module often cannot begin until the data extraction completes, and a corresponding load module cannot begin until the data transformation completes; thus, the E, T, and L must occur in sequence. Although process dependencies such as these cannot be avoided in many cases and necessitate serialized software design, in other cases, programs or data can be distributed across two or more SAS sessions to be processed in parallel, facilitating significantly faster software. This text introduces the concept of false dependencies, in which software is serialized by (poor) design rather than necessity, thus needlessly increasing execution time and degrading performance. Three types of false dependencies are demonstrated, as well as distributed software solutions that eliminate false dependencies through parallel processing, arming SAS practitioners to accelerate both their software and salaries.
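One common Base SAS mechanism for running independent programs in parallel sessions is the SYSTASK statement. The following is a minimal sketch under assumed conditions (a command-line SAS installation; the program paths are hypothetical), not the paper's own design:

```sas
/* Illustrative sketch: launch two independent SAS programs in
   parallel batch sessions, then wait for both to finish before
   the dependent downstream step runs. */
systask command "sas -sysin extract_part1.sas" taskname=t1;
systask command "sas -sysin extract_part2.sas" taskname=t2;
waitfor _all_ t1 t2;   /* block until both tasks complete */
```

If the two extracts have no true dependency on each other, serializing them in one session is exactly the kind of false dependency the paper describes.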


HW-050 : Doing More with the SGPLOT Procedure
Josh Horstman, Nested Loop Consulting

Once you've mastered the fundamentals of using the SGPLOT procedure to generate high-quality graphics, you'll certainly want to delve into the extensive array of customizations available. This workshop will move beyond the basic techniques covered in the introductory workshop. We'll go through more complex examples such as combining multiple plots, modifying various plot attributes, customizing legends, and adding axis tables.


HW-063 : Interactive Graphs
Richann Watson, DataRich Consulting
Kriss Harris, SAS Specialists Ltd

This paper demonstrates how you can use interactive graphics in SAS® 9.4 to assess and report your safety data. The interactive visualizations you will be shown include adverse event and laboratory results. In addition, you will be shown how to display "details-on-demand" when you hover over a point. Adding interactivity to your graphs will bring your data to life and help improve lives!
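One standard SAS 9.4 mechanism for "details-on-demand" hover text is an HTML image map with the TIP= option. This sketch uses hypothetical variable and file names (a lab data set with study day, value, subject, and parameter), not the paper's code:

```sas
/* Illustrative sketch: hover tooltips in HTML output */
ods graphics on / imagemap=on;        /* enable hover regions */
ods html5 file='lab_plot.html';
proc sgplot data=adlb;
   scatter x=ady y=aval / tip=(usubjid paramcd aval);  /* shown on hover */
run;
ods html5 close;
```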


HW-086 : Base SAS® & SAS® Enterprise Guide® Automate Your SAS® World with Dynamic Code ~ Forwards & Backwards
Kent Phelps, Illuminator Coaching, Inc.
Ronda Phelps, Illuminator Coaching, Inc.

Communication is the basic foundation of all relationships, including our SAS relationship with the server, PC, or mainframe. To communicate more efficiently ~ and to increasingly automate your SAS world ~ you will want to learn how to transform static code into dynamic code that automatically re-creates the static code, and then executes the re-created static code automatically. Our presentation highlights the powerful partnership that occurs when dynamic code is creatively combined with a dynamic FILENAME statement, macro variables, the SET INDSNAME option, and the CALL EXECUTE command within one SAS Enterprise Guide Program node. You have the exciting opportunity to learn how to design dynamic code forwards and backwards to re-create static code while automatically changing the year as 1,574 time-consuming manual steps are amazingly replaced with only one time-saving dynamic automated step. We invite you to attend our Dynamic Code Presentation, in which we detail the UNIX and Microsoft Windows syntax for our project example and introduce you to your newest BFF (Best Friend Forever) in SAS. Please see the appendixes to review additional starting-point information about the syntax for IBM z/OS, and to review the source code that created the data sets for our project example.
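The CALL EXECUTE routine mentioned above is the core of this kind of code generation. As a minimal sketch, with a hypothetical control data set YEARS (one row per YEAR value) driving one generated step per row:

```sas
/* Illustrative sketch: a DATA _NULL_ step writes and queues one
   PROC PRINT per year found in the control data; the generated
   code executes automatically after this step ends. */
data _null_;
   set years;
   call execute(cats('proc print data=sales_', year, '; run;'));
run;
```

The dynamic code "re-creates" what would otherwise be many hand-maintained static steps, which is the time-saving pattern the presentation describes at much larger scale.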


HW-094 : Not Even One Single Solitary Semicolon: Powerful SAS Things You Can Do Without Writing Programs
Ben Cochran, The Bedford Group

This presentation starts by illustrating how to convert different kinds of data into SAS data sets. Specifically, Excel spreadsheets and Microsoft Access tables are converted into SAS data. Then, these two data sources are joined with an existing SAS data set. Finally, a series of graphical and tabular reports are generated from the combined data. All of these tasks are completed without writing any SAS programs.


Industry

IN-009 : Creating of Consumer and Product Profile by using Polytomous Rasch Measurement Model and Relational Bayesian Networks
Samuel Berestizhevsky, InProfix Inc
Tanya Kolosova, Co-author

Consumer and Product Profiles are blueprints for identifying consumers' attitudes toward product attributes. They outline the critical attributes of products to be offered to consumers, aligned with consumer preferences to assure that the product satisfies consumers' needs now and will be demanded in the future. The Consumer and Product Profile development process comprises gathering data via simple surveys about product attributes and consumer preferences. Analysis of this data results in a Consumer and Product Profile that describes the differentiating consumer preferences regarding specific product attributes. Unfortunately, this survey data is often inappropriately analyzed, leading to wrong assessments of Consumer and Product Profiles, incorrect inferences about consumer preferences and product attribute strengths, and misleading recommendations on how to improve the product. We developed innovative mathematical approaches, algorithms, and software solutions that not only help to overcome the problems with analysis of consumer and product surveys but also help to build Consumer and Product Profiles in a fully automated and scalable way. Our solutions provide accurate and reliable information about the preferences of an individual consumer and her/his perception of product attributes, and eventually create an accurate quantitative estimation of Consumer and Product Profiles. Product designers, manufacturers, and retailers can use Consumer and Product Profiles to create products that meet customers' needs and expectations, create hyper-targeted marketing campaigns, personalize and optimize product prices, etc. SAS/Base and SAS/STAT are the modules used in our development.


IN-014 : Comparison of three methods for transforming predictor variables to improve model fit using SAS
Doug Thompson, Rush Health

Continuous and ordinal predictor variables are common in predictive modeling (e.g., age in years, medical expenditures last year). Often, such variables are non-linearly related to the predictive modeling target. To maximize the accuracy of a predictive model, non-linear associations need to be taken into account and included in the final model when appropriate. There seems to be no consensus on how best to detect and quantify non-linear associations when building predictive models. Several methods have been proposed in the literature, including cubic splines and exploring a wide variety of functional forms and then selecting the best-fitting via stepwise techniques. Although multivariate adaptive regression splines (MARS) and similar methods are often viewed as a stand-alone technique for predictive modeling, these techniques could also be used for exploring non-linear associations that are then included in a final model constructed using some other modeling technique (e.g., logistic regression or neural networks). The purpose of this paper is to illustrate three possible methods for exploring non-linear associations in predictive modeling using SAS: cubic splines, MARS, and stepwise selection of the best fitting of exploratory functional forms. A SAS macro is described, facilitating easy implementation and evaluation of each of these techniques. The techniques illustrated require only SAS/STAT (particularly PROCs ADAPTIVEREG, LOGISTIC, and SGPLOT). The audience is assumed to have intermediate familiarity with predictive modeling. The techniques are illustrated in a context that is common and important within the healthcare industry: predicting which patients will have relatively high healthcare expenditures next year.
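The MARS-style exploration referenced above is available in SAS/STAT through PROC ADAPTIVEREG. This is a generic sketch with hypothetical data set and variable names (a binary high-cost indicator and a continuous predictor), not the macro described in the paper:

```sas
/* Illustrative sketch: let adaptive regression splines find knots
   in the relationship between a binary target and a continuous
   predictor, as an exploratory step before a final model. */
proc adaptivereg data=claims plots=all;
   model high_cost = age / dist=binomial;
run;
```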


IN-015 : Scatterplots: Basics, enhancements, problems and solutions.
Peter Flom, Peter Flom Consulting

The scatter plot is a basic tool for presenting information on two continuous variables. While the basic plot is good in many situations, enhancements can increase its utility. I also go over tools to deal with the problem of overplotting. SAS, any operating system or version, appropriate for all levels.


IN-019 : Time-To-Event Analysis in the Presence of Competing risks
Xiaoting Wu, University of Michigan

Competing risks are a common phenomenon in time-to-event analysis. A competing risk may take place before the event of interest and thus exclude the possibility of the event occurring. For example, in the study of artificial heart valve duration, death is a competing risk, as it removes a patient's chance of receiving a potential reoperation due to valve deterioration. Ignoring competing risks, for example by using standard Kaplan-Meier estimators, will result in biased estimates for the event of interest. The cumulative incidence function, which estimates the probability of the event of interest over time, and the cause-specific hazard function, which models the effect of covariates on the event of interest, are the two main approaches to time-to-event analysis in the presence of competing risks. This paper demonstrates the rationale, implementation, and interpretation of these methods, with SAS applications using the %CIF macro and the LIFETEST and PHREG procedures.
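In recent SAS/STAT releases, PROC PHREG can fit a competing-risks (Fine-Gray subdistribution) model directly via the EVENTCODE= option. The following sketch uses hypothetical data and variable names in the spirit of the heart-valve example (status 1 = reoperation, 2 = death as competing risk, 0 = censored); it is not the paper's code:

```sas
/* Illustrative sketch: subdistribution hazard model for the event
   of interest in the presence of a competing risk. */
proc phreg data=valves;
   class treatment;
   model years*status(0) = treatment age / eventcode=1;
run;
```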


IN-021 : Exploring the Skills Needed by the Data Science / Analytics Professional
Kirk Paul Lafler, Software Intelligence Corporation

As 2.5 quintillion bytes (a 1 with 18 zeros) of new data are created each and every day, the age of big data has taken on new meaning, with a renewed sense of urgency to prepare students, young professionals, and other workers across job functions for today's and tomorrow's analytics roles, along with the necessary analytical skills to tackle growing data demands. With many organizations embracing Data Science / Analytics skills and tools, LinkedIn, a leading professional networking and employment-oriented website and app, found that Data Scientist positions saw a 56% increase in the US job market in 2018. To keep up with the huge demand for analytics talent in 2019 and beyond, many colleges, universities, and training organizations offer comprehensive Data Science / Analytics degree and certificate programs to fulfill the increasing demand for analytical skills. This presentation explores the skills needed by the Data Science / Analytics professional, including critical thinking; statistical programming languages such as SAS®, R, or Python; Structured Query Language (SQL); Microsoft Excel; and data visualization.


IN-036 : From FREQing Slow to FREQing Fast: Facilitating a Four-Times-Faster FREQ with Divide-and-Conquer Parallel Processing
Troy Hughes, Datmesis Analytics

With great fanfare, the release of SAS® 9 delivered multithreaded processing to a single-threaded SAS world. Procedures such as SORT, SQL, and MEANS could now run faster by taking advantage more fully of system resources through parallel processing paradigms. Multithreading commonly implements divide-and-conquer methodologies in which data sets or data streams are decomposed into subsets and processed in parallel rather than in series. Multithreaded solutions are faster (but typically not more efficient) than their single-threaded counterparts because execution time (but not system resource utilization) is decreased. As the costs of memory and processing power have continued to decrease, however, there remains no excuse for not implementing multithreaded processing wherever possible. To this end, and because SAS unfortunately abandoned some hapless procedures in single-threaded Sheol, this text aims to reunite the single-threaded FREQ procedure with its multithreaded bedfellows. The FREQFAST macro is introduced and espouses divide-and-conquer parallel processing that performs a frequency analysis more than four times faster than the out-of-the-box FREQ procedure. Non-environmental factors affecting FREQ performance (e.g., number of observations, number of unique observations, file size) are elucidated and modeled to demonstrate and predict performance improvement delivered through FREQFAST.


IN-044 : The SAS EG Process Flow: A Customizable Data Mining Tool in the Search for Healthcare Fraud
Andrea Mclain, Cigna Health Insurance

It is estimated that tens of billions of dollars are lost each year to fraudulent healthcare insurance claims. The implications go well beyond financial losses and higher insurance premiums. For instance, many fraud schemes could result in patient exploitation or harm, or the illicit gains could be used in the furtherance of other criminal activities. Health insurance companies utilize data mining and predictive analytics to identify potentially fraudulent claims. Many third-party companies create products for this very purpose, where algorithms are used to flag claims exhibiting some known fraudulent pattern. Products built by these companies are exceptionally helpful in identifying and ultimately stopping and preventing insurance fraud, but many situations call for more, and a means beyond pre-built algorithms is necessary. This presentation is about one such instance, where a creative, on-the-fly data mining process was built within a SAS project to identify potential health insurance fraud in natural disaster scenarios, such as a hurricane or large-scale wildfire. This presentation will detail how an analyst started with millions of insurance claims and then utilized simple analytical methods within a SAS project to generate a small list of potentially fraudulent healthcare providers who billed for services they likely could not have rendered due to circumstances surrounding a natural disaster. The SAS skills required to create this process were basic, but speak to the larger concepts of intelligence analysis and data mining in the identification of a criminal pattern.


IN-061 : Leveraging RAREEVENTS Procedure Options to Monitor and Evaluate Infrequent Events in Healthcare
Lynn (Xiaohong) Liu, Ann & Robert H. Lurie Children's Hospital of Chicago
Roderick Jones, Ann & Robert H. Lurie Children's Hospital of Chicago

In healthcare, the purpose of statistical process control (SPC) is often to quantify improvements and identify unintended consequences resulting from an intentional change in an environment, policy, treatment protocol, or decision-support tool. The RAREEVENTS procedure has gained acceptance in healthcare quality improvement applications due to its suitability for infrequent, low-probability events. Enhancements to the procedure in SAS/QC version 15.1 allow users to apply tests to detect special-cause variation. We provide examples from healthcare, representing the geometric and exponential distributions, to describe approaches leveraging the RAREEVENTS procedure options, including READPHASE=, READINDEXES=, PHASEREF, PHASELEGEND, TESTS=, TESTACROSS, and TESTOVERLAP.


IN-064 : Detecting Side Effects and Evaluating Effectiveness of Drugs from Customers' Online Reviews using Text Analytics and Data Mining Models
Thu Dinh, Oklahoma State University
Goutam Chakraborty, Oklahoma State University
Miriam Mcgaugh, Oklahoma State University

Drug reviews play a very important role in providing crucial medical care information for both healthcare professionals and consumers. Customers increasingly use online review sites, discussion boards, and forums to voice their opinions and express their sentiments about the drugs they have used. However, a potential buyer would find it almost impossible to review all of these online comments before making a purchase decision. Another big challenge is the unstructured, qualitative, and textual nature of the reviews, which makes it difficult for readers to distill the comments into meaningful insights. The aim of the present paper is to identify a data-mining model to evaluate the effectiveness of, and detect potential side effects from, online customer reviews of specific prescription drugs. This study utilizes text parsing, text filtering, and text clustering within SAS® Enterprise Miner™ 14.3 for feature engineering and SAS® Sentiment Analysis Studio 12.2.5 for sentiment analysis. Further, multiple machine learning models, including logistic regression, decision tree, and neural network, are employed to identify an optimal model. The study's preliminary results show that the best predictive model for side effect detection is a neural network, with a validation misclassification rate of 23.4% and a sensitivity rate of 68.5%. For effectiveness classification, a neural network model also works best, with an 18.2% validation misclassification rate and a 91.6% sensitivity rate. These models will be further improved, and the information will be employed to evaluate model performance and validity. The results can serve as practical guidelines and useful references for prospective patients in making better-informed purchase decisions.


IN-080 : Do undergraduates need prerequisites for common courses?
Lin Qi, Oklahoma State University
Archana Chinnaswamy, InterWorks.Inc
Miriam Mcgaugh, Oklahoma State University

A good design of common courses is the starting point for a business school student's academic success. The Oklahoma State University (OSU) Spears School of Business recently revised its common core curriculum, reducing it to 10 core courses for all business school majors, none of which requires a prerequisite course for enrollment. Students may choose which course they want to take first. From a student's point of view, the lack of prerequisites allows more freedom in scheduling. However, enrolling in high-level courses before low-level courses may result in more students getting a D, failing the course, or withdrawing from it (DFW). To test the value of a more structured schedule or a potential prerequisite requirement, this paper analyzes the effect of common-course order on students' DFW rates for business common courses. Datasets containing demographic information for about 5,000 students and 50,000 common-course enrollment outcomes were supplied by the OSU Institutional Research & Information Management department. This research used SAS Studio for data preparation and SAS Enterprise Miner for data modeling, including decision tree, logistic regression, and neural network models. Final model selection was made according to the models' Average Square Error. The results show the decision tree as the best model, with high school core GPA and degree major as the variables significantly influencing DFW rates. The influence of course order is not significant. Undergraduates may not need prerequisite requirements to succeed in business school common courses.


IN-081 : Using SAS to predict the occurrence of study milestones used to initiate planned interim analysis
Benjamin Cronk, Amgen

In clinical trials, it is common for an interim analysis to occur upon reaching a study milestone such as a predetermined number of clinical events. Estimating the date at which the trial is likely to accrue those clinical events is beneficial for a number of reasons including planning for programming resources. This paper will use SAS/ETS procedures and clinical events observed at the beginning of a trial to estimate the time at which the trial is likely to have generated the number of clinical events needed to initiate planned interim analysis. The SGPLOT procedure will be used to generate figures that effectively show the number of events currently observed on the clinical trial as well as the projection of future adverse events including confidence intervals.
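A minimal sketch of this kind of projection, using SAS/ETS PROC ESM to extrapolate a cumulative event count and PROC SGPLOT to display the projection with confidence limits (the data set, variable names, and milestone value here are invented for illustration):

```sas
/* Forecast the cumulative event count 12 months ahead with
   double (Holt linear) exponential smoothing */
proc esm data=events_by_month outfor=proj lead=12;
   id month interval=month;
   forecast cum_events / model=linear;
run;

/* Plot observed events, the projection, and its confidence band;
   the REFLINE marks a hypothetical interim-analysis milestone */
proc sgplot data=proj;
   band x=month lower=lower upper=upper / transparency=0.6;
   series x=month y=actual;
   series x=month y=predict / lineattrs=(pattern=dash);
   refline 250 / axis=y label='Milestone';
run;
```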


IN-083 : Classifying Risk in Life Insurance using Predictive Analytics
Sai Gopi Krishna Govindarajula, Oklahoma State University
Miriam Mcgaugh, Oklahoma State University

Ever wonder how many companies offer life insurance? There are more than 600 companies in the US alone offering life insurance policies. Insurance companies perform an underwriting process to assess the risk of life insurance applicants and then price the policies if approved. Underwriters gather extensive information about applicants, including extensive health histories, to classify risk profiles. The process of collecting existing data for the risk assessment, completing and obtaining any required patient health exams, and validating all the information often takes several weeks to months. In this fast-paced world, customers are prone to lose interest in finalizing policies with companies that take a prolonged time to evaluate an application. With the advent of data analytics, the underwriting process can be streamlined and completed much faster. The intention of this project was to build predictive models based on past customer history and to recommend the most appropriate model to assess risk, resulting in better underwriting practices and customer retention. A real data set with around 140 variables, a combination of categorical and continuous, was analyzed using SAS® Enterprise Miner™ and Tableau® for predictive modeling and data visualization, respectively. Machine learning algorithms such as logistic regression and neural networks were implemented to assess risk, and findings revealed that the regression model showed the highest performance, with a misclassification rate of 21.09%.


IN-087 : Tuning Tufte: creating minimalist data visualizations with SG Plot
Dan Dewitz, WPS Health Solutions

Tufte wrote about the data-ink ratio--the data-ink divided by the total ink used to produce the graphic. The default settings of SG Plot defy this logic by drawing a border around the legend, the graph space, and the entire graphic! The aim of this paper is to provide pragmatic solutions for designing minimalist and effective graphics using SG Plot. This paper provides tips on designing the custom plot you want without having to learn the Graph Template Language (GTL). Topics covered include implementing custom color palettes, designing dynamic graphics that are robust and able to handle changing data, selectively highlighting key metrics, adding custom labels to time series plots, plotting transformed data, using broken scales, and adhering to widely accepted data viz design principles in SG Plot. This paper also implements definitive examples from the most popular graphing handbooks--The Visual Display of Quantitative Information, Storytelling with Data, The Functional Art--using SG Plot.
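The border-stripping idea can be sketched in a few lines; the options below (NOBORDER, NOWALL, and axis DISPLAY= suppression) are standard SGPLOT settings, shown here on SASHELP data rather than the paper's own examples:

```sas
ods graphics / noborder;                 /* drop the outer graphic border */

proc sgplot data=sashelp.stocks noborder nowall;
   where stock = 'IBM';
   series x=date y=close;
   xaxis display=(noline noticks);       /* strip non-data ink from axes */
   yaxis display=(noline noticks);
run;
```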


IN-088 : Timing is Everything: Defining ADaM Period, Subperiod and Phase
Nancy Brucken, Syneos Health

The CDISC Analysis Data Model Implementation Guide (ADaMIG) provides several timing variables for modeling clinical trial designs in analysis datasets. APHASE, APERIOD and ASPER can be used in conjunction with related treatment variables to meet a variety of analysis requirements, from single-period parallel studies to much more complicated situations involving multiple treatment periods and even different studies. The goal of this paper is to illustrate how some of these study designs may be handled in ADaM, and provide guidelines for selecting when to use the different timing variables that are available.


IN-091 : Simulating Skewed Multivariate Distributions Using SAS: The Cases of Lomax, Mardia's Pareto (Type I), Logistic, Burr and F Distributions
Zhixin Lun, Oakland University
Ravindra Khattree, Oakland University

By using various built-in functions in SAS software, it is convenient to generate data from several common multivariate distributions, such as the multivariate normal (RANDNORMAL function) and multivariate Student's t (RANDMVT function). However, functions for directly generating data from other, less common multivariate distributions are not readily available in SAS. We will illustrate how to simulate and generate random numbers from a multivariate Lomax distribution. The importance of this work lies in its wide applicability in reliability theory and many other situations where one needs a flexible, nonnegative, skewed multivariate distribution for modeling. Further, based on various useful properties of the multivariate Lomax distribution, Mardia's multivariate Pareto of type I, multivariate logistic, multivariate Burr, and multivariate F random variables can be readily simulated. We develop and implement a SAS macro using SAS/IML to generate random numbers from these multivariate probability distributions.
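One standard construction of the multivariate Lomax is a gamma mixture of exponentials: draw a shared gamma variate G and set each component to an independent exponential divided by G. A minimal SAS/IML sketch of that construction (not the authors' macro; the shape and rate values are illustrative):

```sas
proc iml;
   call randseed(2019);
   n = 1000;  a = 3;  theta = {1 2};      /* gamma shape; exponential rates */
   g = randfun(n, "Gamma", a);            /* shared gamma mixing variable   */
   x = j(n, ncol(theta), .);
   do jcol = 1 to ncol(theta);
      e = randfun(n, "Exponential");      /* standard exponential draws     */
      x[, jcol] = (e / theta[jcol]) / g;  /* divide by the common gamma     */
   end;
   /* rows of x are draws from a bivariate Lomax distribution */
quit;
```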


IN-092 : The Optimal Flight Ticket Price Model Based on Bivariate Normal Distribution
Tao Shu, Eli Lilly

Many prediction models, linear or nonlinear, are driven by explanatory factors, yet in many cases they fail to generate the expected results. This is because of the limits of the analytic model, which cannot deal with the unknown or uncontrollable factors that affect the object we are interested in. Rather than trying to dig out all the factors, we can instead treat the variables as random. From a statistical perspective, the variables do have connections among them, and finding such characteristics is the key to decision making. For example, we can construct a bivariate normal distribution for two random variables. With that, we can decide one variable first and then find the best value for the second so as to reach our goal. This paper applies this idea to airline ticket price strategy. In simulation, the optimized model can produce better sales results than a random one.


IN-095 : Findings About: De-mystifying the When and How
Michael Wise, Syneos Health
Soumya Rajesh, Syneos Health

CDISC offers Findings About (FA) and Supplemental Qualifiers (SuppQual) to handle information that doesn't fit into standard domains - 'non-standard variables'. They are, however, quite distinct from each other, and the appropriate use of each may still lead to confusion. "When should FA be created?" or "When is it best to use SUPPQUAL?" These are important questions that can only be answered by asking additional data questions. When the data does not fit into the parent domain, it may only be mapped to SUPPQUAL if it relates to one parent record. However, almost all other situations are covered by FA - wherein data relates to multiple records, or a two-way relationship is needed, etc. FA would be the right approach then, because it has versatility beyond what's offered by SUPPQUAL. For example, FA would provide a way of storing symptoms along with the time that they began and relating each back to the AETERM in the AE dataset. In addition, FA as a stand-alone domain is also the only place to store information surrounding an event or intervention that has not been captured within any specific domain. This paper will present examples from a few different therapeutic areas or domain relationships to highlight the proper use of FA. Another scenario will look into how FA accommodates a many-to-many relationship. These examples should clarify the mysteries surrounding when and how to best use or create FA.


IN-113 : Sample Size and Design Considerations in Studies Assessing Non-Inferiority using Continuous Outcomes
Michael G. Wilson, IUSM

Software developers rely on incremental progress to achieve radical development breakthroughs. The same is true in medicine, manufacturing, and finance. For example, a new anti-diabetic medicine might not offer superior glycemic control, but it might be less expensive. Or a new device for use in hand surgery might not yield superior digital mobility, but might be easier for the surgeon to implant. Or perhaps micro-loans to novice entrepreneurs might not raise the economic output of the county, but might cultivate cooperation among local businesses. These are examples where the outcome of a new method might not be objectively worse, that is, non-inferior, but there would be some reason to replace the current method and instigate incremental progress. SAS users are often asked to size and design studies to test this kind of non-inferiority. Such a design requires consideration of the framework of the hypothesis setup, the directionality, the determination of the non-inferiority margin, and the proper analysis method. In this review, the rationale for these considerations will be presented, common misunderstandings clarified, and examples using SAS/STAT® given.
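A non-inferiority sample-size calculation of this kind can be sketched with PROC POWER, shifting the null to the margin via NULLDIFF= and using a one-sided test. The margin, standard deviation, and other values below are invented for illustration:

```sas
proc power;
   twosamplemeans test=diff
      nulldiff  = -2      /* non-inferiority margin (hypothetical)      */
      meandiff  = 0       /* assume the new method is truly equivalent  */
      stddev    = 5
      sides     = U       /* one-sided, in the favorable direction      */
      alpha     = 0.025
      power     = 0.9
      npergroup = .;      /* solve for the per-group sample size        */
run;
```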


Rapid Fire

RF-003 : A Visual Step-by-step Approach to Converting an RTF File to an Excel File
Kirk Paul Lafler, Software Intelligence Corporation

Rich Text Format (RTF) files incorporate basic typographical styling and word processing features in a standardized document that many programs and applications are able to read. In today's high-tech arena, the contents of an RTF file sometimes need to be viewed as, and even converted to, an Excel file. You would think that since both RTF and Excel are Microsoft standards this would be a simple process, but you may be surprised to find out that is not the case. Learn about several "free" web-based and online applications as well as traditional SAS®-based programming techniques that can be used to convert an RTF file to an Excel file.


RF-004 : Saving and Restoring Startup (Initialized) SAS® System Options
Kirk Paul Lafler, Software Intelligence Corporation

Processing requirements sometimes require the saving (and restoration) of SAS® System options at strategic points during a program's execution cycle. This paper and presentation illustrate the process of using the OPTIONS, OPTSAVE, and OPTLOAD procedures to perform the following operations: display portable and host-specific SAS System options and their settings; display restricted SAS System options; display SAS System options that can be restricted; display information about SAS System option groups; display a list of SAS System options that belong to a specific group; display a list of SAS System options that can be saved; save startup SAS System options; and restore startup SAS System options, when needed.
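The save/restore cycle described above can be sketched in a few lines; this is only an illustrative skeleton of the three procedures, not the paper's full treatment:

```sas
proc options group=errorhandling;   /* display the options in one group     */
run;

proc optsave out=work.startopts;    /* snapshot the current option settings */
run;

options nodate nonumber;            /* temporary changes for a report step  */
/* ... processing that depends on the altered options ... */

proc optload data=work.startopts;   /* restore the saved settings           */
run;
```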


RF-017 : 3 ways to get Pretty Excel-Style Tables: PROC REPORT, PROC TABULATE, and Help from EG
Brooke Ellen Delgoffe, Marshfield Clinic Research Institute

In many cases SAS programmers may be asked to provide tabular or summary data in an Excel-style format (stacked headers, colored headers, bolded total lines, etc.). This paper explores 3 different ways to produce and export Excel-style tables to ODS destinations or the results window using SAS 9.4 or Enterprise Guide 7.15. The addition of ODS style elements will help readers apply aesthetically pleasing colors and formatting to their output. The use of Enterprise Guide will help users new to SAS perform these tasks with little to no SAS programming knowledge, while helping more versed SAS programmers utilize pre-written code as a starting point. An exploration of how to use PROC SQL in combination with PROC REPORT to display distinct counts and other uniquely formatted summary statistics will give programmers a succinct way to display summary statistics in the midst of required value duplication. For example, multiple Body Mass Index (BMI) entries for a single patient identifier may be needed to provide mean BMI per patient per period, but distinct patient counts may still be desired. A series of examples like this one will be used to cover each of the methods.
Brief Outline:
- Introduction
- PROC SQL with PROC REPORT
  - Using PROC SQL to obtain and format counts of interest
  - Using PROC REPORT to display grouped data with stacked headers
  - Customizing PROC REPORT output with style elements
  - EXAMPLE: Distinct patient counts (fake patient data provided in a DATA step)
- PROC TABULATE for presenting multiple statistics on the same variable
  - Creating a grouped table of summary statistics with stacked headers
  - Exporting to the ODS Excel destination
  - EXAMPLE: Cars data with stacked headers using SASHELP.CARS
- Summary Tables Builder in Enterprise Guide
  - Creating the same output as the PROC TABULATE example above
  - Step-by-step process with screenshots
  - Formatting values with Enterprise Guide
  - Filtering data in the Summary Tables Builder
SAS Versions: SAS EG 7.15 (HF7)*, SAS 9.4 (TS Level 1M3) (*dependency)
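A minimal sketch of the PROC TABULATE piece of this outline, using SASHELP.CARS and the ODS Excel destination (the output path is a placeholder):

```sas
ods excel file='cars_summary.xlsx';

proc tabulate data=sashelp.cars;
   class origin type;
   var mpg_city;
   table origin*type,                 /* stacked row headers            */
         mpg_city*(n mean);           /* multiple statistics per column */
run;

ods excel close;
```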


RF-025 : Using SAS to recreate Mike Bostock's creation of an E.J. Marey-inspired Rail Traffic Plot
Bill Qualls, First Analytics

Mike Bostock, a principal developer of the d3.js data visualization library, has published a program showing how he used d3.js to create an E.J. Marey-inspired graph showing San Francisco area commuter rail traffic. This paper will show how to create a string plot using SAS by recreating Bostock's graph. Along the way the reader will learn several useful hacks to improve their own use of SAS' PROC SGPLOT.


RF-029 : Using the XLSX libref engine with metadata available in Dictionary Tables
Michael Harper, A-Line Staffing

This paper will show how to make simple references to external Excel workbooks with the XLSX libref engine, and how to reference their metadata elements using the DICTIONARY tables.
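A minimal sketch of the technique (the workbook path is hypothetical): each worksheet becomes a member of the libref, and its columns then appear in DICTIONARY.COLUMNS like any other SAS data set:

```sas
libname xl xlsx '/data/budget.xlsx';   /* hypothetical workbook */

proc sql;
   select memname, name, type, length  /* sheet, column, and attributes */
      from dictionary.columns
      where libname = 'XL';
quit;

libname xl clear;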


RF-033 : The Doctor Ordered a Prescription…Not a Description: Driving Dynamic Data Governance Through Prescriptive Data Dictionaries That Automate Quality Control and Exception Reporting
Troy Hughes, Datmesis Analytics

Data quality is a critical component of data governance and describes the accuracy, validity, completeness, and consistency of data. Data accuracy can be difficult to assess, as it requires a comparison of data to the real-world constructs being abstracted. But other characteristics of data quality can be readily assessed when provided a clear expectation of data elements, records, fields, tables, and their respective relationships. Data dictionaries represent a common method to enumerate these expectations and help answer the question "What should my data look like?" Too often, however, data dictionaries are conceptualized as static artifacts that only describe data. This text introduces dynamic data dictionaries that instead prescribe business rules against which SAS® data sets are automatically assessed, and from which dynamic, data-driven, color-coded exception reports are automatically generated. Dynamic data dictionaries--operationalized within Excel workbooks--allow data stewards to set and modify data standards without having to alter the underlying software that interprets and applies business rules. Moreover, this modularity--the extraction of the data model and business rules from the underlying code--flexibly facilitates reuse of this SAS macro-based solution to support endless data quality objectives.


RF-034 : Abstracting and Automating Hierarchical Data Models: Leveraging the SAS® FORMAT Procedure CNTLIN Option To Build Dynamic Formats That Clean, Convert, and Categorize Data
Troy Hughes, Datmesis Analytics

The SAS® FORMAT procedure "creates user-specified formats and informats for variables." In other words, FORMAT defines data models that transform (and sometimes bin) prescribed values (or value ranges, in the case of numeric data) into new values. SAS formats facilitate multiple objectives of data governance, including data cleaning, the identification of outliers or new values, entity resolution, and data visualization, and can even be used to query or join lookup tables. SAS formats are often hardcoded into SAS software, but where data models are fluid, formats are best defined within control files outside of the software. This modularity--the separation of data models from the programs that utilize them--allows SAS developers to build and maintain SAS software independently while domain subject matter experts (SMEs) separately build and maintain the underlying data models. Independent data models also facilitate master data management (MDM) and software interoperability, allowing a data model to be maintained as a single instance, albeit implemented not only with SAS but also Python, R, or other languages or applications. The CNTLIN option (within the SAS FORMAT procedure) facilitates this modularity by creating SAS formats from data sets. This text introduces the BUILD_FORMAT macro that greatly expands the utility of CNTLIN, allowing it to build formats not only from one-to-one and many-to-one format mappings but also from multitiered, hierarchical data models that are built and maintained externally in XML files. The numerous advantages of BUILD_FORMAT are demonstrated through successive SAS code examples that rely on the taxonomy of the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5).
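A minimal sketch of the plain CNTLIN mechanism that BUILD_FORMAT builds upon (a flat one-to-one mapping with invented codes; the macro itself extends this to hierarchical XML models):

```sas
/* Control data set: one row per mapping */
data ctrl;
   length fmtname $8 start $1 label $12;
   fmtname = '$SEVF';                  /* character format */
   start = '1'; label = 'Mild';     output;
   start = '2'; label = 'Moderate'; output;
   start = '3'; label = 'Severe';   output;
run;

/* Build the format from the data set instead of hardcoded VALUE statements */
proc format cntlin=ctrl;
run;

/* Apply it, e.g.: severity_text = put(sev_code, $sevf.); */
```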


RF-038 : Evaluate your SCORE: Logistic regression prediction comparison using the SCORE statement
Robert G. Downer, Grand Valley State University

The SCORE statement in PROC LOGISTIC was introduced in SAS/STAT 9.0, and it is a feature that can be used to quickly evaluate prediction performance for new observations. Used in conjunction with the OUTMODEL and INMODEL options, the SCORE statement can be a very beneficial aid in quickly comparing the prediction performance of multiple logistic regression models for the same test or validation observations. The concise syntax of these statements will be illustrated. Performance criterion output, such as the misclassification rate, will be discussed through a worked example involving multiple models of a binary response. Although some knowledge of logistic regression would be beneficial for full understanding of this paper, it is written for a general audience interested in predictive modeling.
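A minimal sketch of the OUTMODEL/INMODEL/SCORE pattern (data set and variable names are invented): fit once, store the model, then score a holdout sample without refitting:

```sas
/* Fit the model and store it */
proc logistic data=train outmodel=work.mod1;
   model response(event='1') = x1 x2 x3;
run;

/* Score new observations from the stored model;
   FITSTAT adds fit statistics when the response is present */
proc logistic inmodel=work.mod1;
   score data=holdout out=scored fitstat;
run;
```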


RF-042 : Fuzzy Matching Commercial Entity Names
Richard Spotswood, Modeler

Analysts need to join tables on commercial entity names when no common keys exist between two tables. Commercial entity names differ from personal names: as a legal entity, the commercial entity name is generally standardized, but the commercial name is often subject to abbreviations, truncations, the addition of non-alphabetic characters, and re-characterization via branding. Previous papers on SAS® and fuzzy matching generally follow the Fellegi-Sunter record linkage methodology: data transforms are used to clean up data, followed by a Cartesian join and an edit-distance metric to obtain an acceptable match rate. This paper follows a similar methodology, but focuses exclusively on the frequent case where there are only abbreviations and truncations of the canonical name. Following Jaro and Winkler, who derived edit distances by explicitly modeling the location of transpositions in personal name transcriptions, this paper suggests that truncations and abbreviations in electronic records are best modeled as deletions from a canonical name and best captured through regular expressions and the Longest Common Subsequence (LCSQ). To illustrate the approach, the LCSQ metric is created by using COMPCOST and COMPGED within a PROC FCMP wrapper, while regular expression matching is done with two tables using a mixture of PRXCHANGE and PRXMATCH.
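The cost-adjusted edit-distance idea can be sketched in a DATA step; the paper wraps this logic in PROC FCMP and adds the regular-expression layer, so the costs and names below are only illustrative:

```sas
data _null_;
   a = 'INTERNATIONAL BUSINESS MACHINES CORP';
   b = 'INTL BUS MACHINES';
   /* Make deletions/insertions cheap relative to replacements,
      so truncation- and abbreviation-style edits score well */
   call compcost('insert=', 10, 'delete=', 10, 'replace=', 100);
   ged = compged(a, b);
   put ged=;
run;
```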


RF-058 : Like, Learn to Love SAS® Like
Louise Hadden, Abt Associates Inc.

How do I LIKE SAS®? Let me count the ways.... There are numerous instances where LIKE or LIKE operators can be used in SAS - and all of them are useful. This paper will walk through such uses of LIKE as: using the LIKE condition to perform pattern-matching; searches and joins with that smooth LIKE operator (and the NOT LIKE operator); the SOUNDS LIKE operator; and PROC SQL CREATE TABLE LIKE. We will explore the pros and cons of each LIKE functionality in SAS, and suggest alternatives if LIKE falls short of love.
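A minimal sketch of two of the LIKE flavors counted above, using a SASHELP table for illustration:

```sas
proc sql;
   /* Pattern matching: % matches any string, _ matches one character */
   select name
      from sashelp.class
      where name like 'J%';

   /* CREATE TABLE ... LIKE clones the structure with zero rows */
   create table work.shell
      like sashelp.class;
quit;
```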


RF-060 : DOMinate your ODS Output with PROC TEMPLATE, ODS Cascading Style Sheets (CSS), and the ODS Document Object Model (DOM)
Louise Hadden, Abt Associates Inc.
Troy Hughes, Datmesis Analytics

SAS® practitioners are frequently required to produce SAS output in mandatory formats, such as using a company logo, corporate or regulated government templates, and/or a cascading style sheet (CSS). SAS provides several tools to enable the production of customized output. Among these tools are the ODS Document Object Model, cascading style sheets, PROC TEMPLATE, and ODS style overrides (usually applied in procedures and/or in the originating data). This paper and presentation investigate "under the hood" of the Output Delivery System destinations and PROC REPORT, and show how mastering ODS TRACE DOM and controlling styles with the CSSSTYLE= option, PROC TEMPLATE, and style overrides can satisfy client requirements and enhance ODS output.
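A minimal sketch of two of these tools together (file names are placeholders): a CSS file applied destination-wide with CSSSTYLE=, plus a per-column style override inside PROC REPORT:

```sas
ods html5 file='report.html' cssstyle='corporate.css';

proc report data=sashelp.class;
   column name age;
   define age / display
      style(column)=[backgroundcolor=lightyellow];  /* inline override */
run;

ods html5 close;
```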


RF-067 : Breaking Human Trafficking Network: An Analytics Approach
Raj Laxmi Prakash, Oklahoma State University
Miriam Mcgaugh, Oklahoma State University

Labor migration, illegal sex work, child trafficking, and related crimes impact our society in many ways. The need to develop tools to discover and track human trafficking is extremely important and has drawn the attention of many researchers [1]. This exploratory paper describes a comprehensive analysis of the role of online classified advertisements in facilitating sex trafficking specifically and explores technological innovations to combat the growing network of human traffickers. With the growth of the internet and social media, human trafficking networks are spreading, aided by the ease of communication. On websites such as Backpage.com, online advertisements are posted to lure men, women, teens, and children [6]. These ads are used for selling as well as recruiting potential victims through manipulation and false promises such as job offers. Traffickers have become more sophisticated in their methods, making them seemingly untraceable and hiding their identities. In 2018, Backpage.com was seized by the FBI for its participation in illegal prostitution and sex trafficking. However, this did not end the problem but shifted it to unknown places [6]. To combat this growing problem, this paper uses the power of text and network analytics to build models for identifying different categories of advertisements and potentially connected relationships, including timing, locations, contact numbers, and other features of the ads. The data was obtained by scraping ads from sites like Backpage.com and analyzed using SAS tools such as SAS Enterprise Guide, SAS Viya, and SAS Enterprise Miner for text, network, and exploratory analysis.


RF-069 : The Advent of Renewable Energy
Sai Teja Sagi, Oklahoma State University
Miriam Mcgaugh, Oklahoma State University

Renewable energy accounted for 12.2% of total primary energy consumption and 14.9% of domestically produced electricity in the United States in 2016. The development of renewable energy and energy efficiency marked "a new era of energy exploration" in the United States (US), according to former President Barack Obama. This research focuses on how the usage of renewable and non-renewable energy has changed over the past 25 years in the US and identifies potential correlations in usage patterns. Secondly, the impact of economic and geopolitical factors is investigated. A clear picture of energy usage by the world's nations can help in making effective policies for renewable energy adoption. Audiences ranging from students to scientists will gain an understanding of energy usage around the world. The dataset was obtained from the United Nations website, and SAS Enterprise Guide was used for data preparation. Statistical methods such as time series analysis and trend-line studies are used to make the visualizations meaningful. Usage patterns of non-renewable energy are shown for the 25 years from 1990 to 2014 and compared to renewable energy for the same period. We can see the exponential increase in energy production from solar and wind and a decrease in the usage of charcoal and brown coal. Usage comparisons between countries such as the USA, Germany, and China are made to assess which countries have been influential in fueling this trend. The analysis shows that the USA has significant growth rates in solar and wind energy production, although its overall capacity is lower.


RF-077 : Utilizing Macros to Create Patient Site Matching via Zip-Code Radiuses
Kathryn Schurr, Quest Diagnostics

Determining how to match patients to clinical trial sites is a messy problem with many facets, including inclusion criteria, exclusion criteria, and proximity to clinical trial sites. With a site matching program in SAS®, users can take patient lists and easily determine whether a patient falls within a certain proximity of the site(s) of interest. This macro takes into account the Zip Code® location of the proposed site and the Zip Code location of the patient, then computes whether that patient should be assigned to any particular site. This paper builds upon an existing macro that calculates the Zip Codes within a given mile radius of a single site location.
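At the core of any such macro is a great-circle distance check between zip-code centroids. The following is a minimal editorial sketch in Python, not the paper's SAS macro; the centroid coordinates and the 50- and 200-mile radii are hypothetical:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 3958.8 * asin(sqrt(a))  # 3958.8 = Earth's mean radius in miles

def within_radius(patient_coord, site_coord, radius_miles):
    """True if the patient's zip-code centroid falls inside the site's radius."""
    return haversine_miles(*patient_coord, *site_coord) <= radius_miles

# Hypothetical centroids: a patient near Grand Rapids, a site in Chicago
patient = (42.96, -85.66)
site = (41.88, -87.63)
print(within_radius(patient, site, 50))   # outside a 50-mile radius
print(within_radius(patient, site, 200))  # inside a 200-mile radius
```

The SAS version could compute the same distance with the built-in ZIPCITYDISTANCE or GEODIST functions rather than coding the haversine formula by hand.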


RF-079 : US Airline Passenger Satisfaction using SAS Enterprise Miner
Harish Reddy Patlolla, Oklahoma State University
Miriam Mcgaugh, Oklahoma State University

In the past 20 years, the aviation industry has grown rapidly. This growth provides opportunities as well as challenges. While the opportunities arise from increasing demand, rival airlines pose a threat to long-established companies. Apart from optimizing pricing, have you ever wondered what airlines do to overcome these threats? Passenger satisfaction. Unhappy passengers mean fewer customers and less revenue. Therefore, it is important that passengers have a rich experience every time they travel. A satisfaction survey from 259,760 passengers, containing a combination of categorical and continuous variables, was used in this study to predict customer satisfaction from variables that airlines can easily obtain. Predictive models were built using decision trees and logistic regression. This study sought not only to explain the important factors that impact passenger satisfaction in the US airline industry, but also to examine how those factors change across age groups. SAS® Enterprise Miner™ and Tableau were used for predictive modeling and exploratory analysis, respectively. The decision tree model predicted customer satisfaction with 86% accuracy, indicating that in-flight entertainment, seat comfort, and ease of online booking were among the most important variables.


RF-100 : Surviving Survival Analysis 101: Making the Likelihood Ratio Test Easier Using a Macro
Katelyn Ware, Grand Valley State University
Rachel Baxter, Grand Valley State University

The likelihood ratio test is a commonly used hypothesis test to examine whether a nested model is a better fit than a full model. In survival analysis, the likelihood ratio test is a useful tool when deciding if interaction terms are needed in a stratified Cox proportional hazards model. When a certain covariate does not meet the proportional hazards assumption in survival data, it can still be included in the Cox proportional hazards model by stratifying on it. However, one must decide whether the slopes differ across strata. The -2LogL values are obtained by running the full and reduced models with PROC LIFEREG, and one can then manually calculate the p-value for a chi-square likelihood ratio test, but there is no automated option available. This paper describes a macro that automates this test for the user.
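Once the two -2LogL values are in hand, the test itself is simple arithmetic. Here is a hedged sketch in Python rather than the paper's SAS macro; the -2LogL values and degrees of freedom are invented, and the closed-form survival function shown is valid only for even degrees of freedom:

```python
from math import exp, factorial

def chi2_sf_even_df(x, df):
    """Chi-square survival function P(X > x) for even df, using the
    closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!."""
    assert df > 0 and df % 2 == 0
    half = x / 2.0
    return exp(-half) * sum(half ** k / factorial(k) for k in range(df // 2))

def likelihood_ratio_test(neg2logl_reduced, neg2logl_full, df):
    """G = (-2 log L_reduced) - (-2 log L_full) ~ chi-square(df) under H0,
    where df is the number of parameters dropped from the full model."""
    g = neg2logl_reduced - neg2logl_full
    return g, chi2_sf_even_df(g, df)

# Hypothetical -2LogL values from reduced (no interactions) and full models,
# with two interaction parameters dropped:
g, p = likelihood_ratio_test(540.6, 533.2, df=2)
print(g, p)
```

In practice one would read the -2LogL values from the PROC LIFEREG fit statistics tables and use a library chi-square distribution for arbitrary df.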


RF-105 : Comparing Dates without an Array
Laurie Smith, Cincinnati Children's Hospital Medical Center

This macro will help SAS v9.4 users, beginners and beyond, compare a subject's dates in separate observations against each other without using an array. PROC SQL, along with a DO loop, allows a user to compare and retain dates that satisfy a defined condition, creating a final dataset with those desired dates.
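The per-subject comparison such a macro performs can be sketched outside SAS as well. Below is a Python illustration; the data, the 30-day gap condition, and the earliest-date baseline are all hypothetical stand-ins for the paper's "defined condition":

```python
from datetime import date
from collections import defaultdict

# Hypothetical long-format data: one observation per subject per date
obs = [
    ("001", date(2019, 1, 5)),
    ("001", date(2019, 1, 20)),
    ("001", date(2019, 3, 1)),
    ("002", date(2019, 2, 10)),
    ("002", date(2019, 2, 12)),
]

def retain_dates(observations, min_gap_days=30):
    """Per subject, compare every date against the subject's earliest date
    (a self-join in the PROC SQL version) and retain those at least
    min_gap_days later -- no array needed."""
    by_subject = defaultdict(list)
    for subj, d in observations:
        by_subject[subj].append(d)
    kept = []
    for subj, dates in by_subject.items():
        first = min(dates)
        kept.extend((subj, d) for d in dates if (d - first).days >= min_gap_days)
    return kept

print(retain_dates(obs))  # only subject 001's 2019-03-01 observation qualifies
```

The PROC SQL analogue joins the table to itself on the subject identifier and applies the date condition in the WHERE clause.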


RF-108 : What Not to Do in a Program Used with %include
Stephanie Thompson, Datamum

This Rapid Fire session provides a fast look at what can make a program listed on a %INCLUDE statement fail, based on a frustrating real-world scenario.


RF-109 : 10 Cool Things You Can Do in a DATA STEP
Stephanie Thompson, Datamum

A look at some interesting things you can do in a DATA step that would be harder to do with other procedures, or that are just plain interesting. Highlights include different aspects of utilizing the PDV, joins, editing a file without bringing it into SAS, automatic variables, and a bit of what to do with DATA _NULL_. See the possibilities in the DATA step!


SAS 101 Plus

SP-002 : SAS® Macro Programming Tips and Techniques
Kirk Paul Lafler, Software Intelligence Corporation

The SAS® Macro Language is a powerful tool for extending the capabilities of the SAS System. Numerous tips, tricks and programming techniques related to the construction of effective macros are demonstrated. Topics include how to process statements containing macros; replace text strings with macro variables; generate SAS code using macros; manipulate macro variable values with macro functions; interface the macro language with the DATA step and SQL procedure; store and reuse macros; construct macros consisting of positional and keyword parameters; troubleshoot and debug macros; and develop efficient and portable macro language code.


SP-005 : SAS® Performance Tuning Techniques
Kirk Paul Lafler, Software Intelligence Corporation

The Base-SAS® software provides users with many powerful techniques for accessing, manipulating, analyzing, and processing data and results. With the availability of so many language features and the size of data sources, application developers, programmers and end-users can benefit from a set of guidelines for efficient use of the SAS software. Topics include a number of performance tuning techniques that can be applied to code and applications to conserve CPU, I/O, data storage, and memory resources while performing tasks more efficiently when sorting, grouping, merging (or joining), summarizing, transforming, and processing data.


SP-012 : PROC FORMAT with HTML - for useful Drill Down output in Web and/or Excel
Zeke Torres, RedMane Technology

This paper and its code combine common SAS features such as PROC TABULATE or PROC SUMMARY, PROC FORMAT, and ODS HTML/EXCEL to give the end user an OLAP-like set of reports. This simple example brings those elements together in a useful way, letting a SAS programmer share output with someone who isn't familiar with SAS but needs a way to drill down into the data, report, and results to investigate further and obtain more answers.


SP-018 : Utilizing SAS Macros, Do Loops and ODS to produce automated production quality individual profiles
George Vineyard, St Louis College of Pharmacy

Combining the features of SAS macros, macro arrays, DO loops, and ODS enables the researcher to expand their portfolio from data analyst to graphical designer. This paper demonstrates how to combine these elements of Base SAS to build automated reports with all the bells and whistles normally found only in standalone reporting tools.


SP-026 : Logistic Regression, Basics and Beyond
Bruce Lund, Independent Consultant

This paper presents light theory, supported by simulations, as well as practical suggestions for developing binary logistic regression models. Topics include: the Firth method versus the usual maximum likelihood method; screening, binning, and transforming predictors; identification of multicollinearity; oversampling for rare events; predictor selection methods using PROC LOGISTIC, HPLOGISTIC, and HPGENSELECT; and measures of fit and predictive accuracy. Products: Base SAS and SAS/STAT. Audience: intermediate users of SAS with some exposure to logistic regression.


SP-031 : User-Defined Multithreading with the SAS® DS2 Procedure: Performance Testing DS2 Against Functionally Equivalent DATA Steps
Troy Hughes, Datmesis Analytics

The Data Step 2 (DS2) procedure represents the first opportunity developers have had to build custom, multithreaded processes in Base SAS®. Multithreaded processing debuted in SAS 9, when built-in procedures such as SORT, SQL, and MEANS were threaded to reduce runtime. Despite this advancement, and in contrast with languages such as Java and Python, SAS 9 still did not give developers the ability to create custom, multithreaded processes. This limitation was overcome in SAS 9.4 with the introduction of the DS2 procedure, a threaded, object-oriented version of the DATA step. However, because DS2 relies on methods and packages (neither of which was previously available in Base SAS), both DS2 instruction and literature have predominantly fixated on these object-oriented aspects rather than on DS2 multithreading. This text is the first to focus solely on DS2 multithreading and its performance advantages. Common DATA step tasks such as data cleaning, transformation, and analysis are demonstrated, after which functionally equivalent DS2 code is introduced. Each paired example concludes with performance metrics that demonstrate faster runtimes with the DS2 language, even on a stand-alone laptop. All examples can be run in Base SAS and do not require in-database processing or the purchase of the DS2 Code Accelerator or other optional SAS components.


SP-043 : Look Up, Not Down: Advanced Table Lookup Techniques in Base SAS
Jayanth Iyengar, Data Systems Consultants LLC
Josh Horstman, Nested Loop Consulting

One of the most common data manipulation tasks SAS programmers perform is combining tables through table lookups. In the SAS programmer's toolkit many constructs are available for performing table lookups. Traditional methods for performing table lookups include conditional logic, match-merging and SQL joins. In this paper we concentrate on advanced table lookup methods such as formats, multiple SET statements, and HASH objects. We conceptually examine what advantages they provide the SAS programmer over basic methods. We also discuss and assess performance and efficiency considerations through practical examples.


SP-052 : Fifteen Functions to Supercharge Your SAS® Code
Josh Horstman, Nested Loop Consulting

The number of functions included in SAS® software has exploded in recent versions, but many of the most amazing and useful functions remain relatively unknown. This paper will discuss such functions and provide examples of their use. Both new and experienced SAS programmers should find something new to add to their toolboxes.


SP-053 : Using Macro Variable Lists to Create Dynamic Data-Driven Programs
Josh Horstman, Nested Loop Consulting

The SAS Macro Facility is an amazing tool for creating dynamic, flexible, reusable programs that can automatically adapt to change. In this paper, you'll see how macro variable lists provide a simple but powerful mechanism for creating data-driven programming logic. Don't hard-code data values into your programs. Eliminate data dependencies forever and let the macro facility write your SAS code for you!


SP-057 : Using ODS Trace (DOM), Procedural Output and ODS Output Objects to Create the Output of Your Dreams
Louise Hadden, Abt Associates Inc.

SAS® procedures can convey an enormous amount of information, sometimes more than is needed. The ODS TRACE and ODS TRACE DOM statements let us discover which output objects and underlying style information are created by each invocation of a SAS procedure and its options. By manipulating procedural output and ODS output objects, we can pick and choose just the information we want to see and report on. We can then harness the power of SAS reporting procedures and various ODS destinations to present the information accurately and attractively. This presentation is suitable for all levels of proficiency. Examples shown were run using SAS 9.4 Maintenance Release 5 on a Windows Server platform.


SP-065 : Quick, Call the "FUZZ": Using Fuzzy Logic
Richann Watson, DataRich Consulting
Louise Hadden, Abt Associates Inc.

SAS® practitioners are frequently called upon to compare data between two different data sets and find that the values in synonymous fields do not line up exactly. A second quandary occurs when there is one data source to search for particular values, but those values are contained in character fields in which they can be represented in myriad ways. This paper discusses robust, if not warm and fuzzy, techniques for comparing and selecting data in SAS data sets under less-than-ideal conditions.


SP-072 : Powerful SAS® Output Delivery with ODS EXCEL
LeRoy Bessler, Bessler Consulting and Research

Results prepared with SAS are often destined for an Excel workbook: everyone already has Excel and knows how to use it to reformat or further explore results however they wish. ODS EXCEL enables a SAS programmer to create highly formatted reports, tabular or graphic or a combination of both, that can be opened and used in Excel. You can turn on customization and formatting features in SAS that would otherwise be applied manually inside Excel, delivering an already finished product to the viewer of the report. ODS EXCEL does not require Excel to be installed on the machine that creates the output; you can use it running SAS on MVS, UNIX, Linux, or Windows. No prior knowledge is assumed. ODS EXCEL output requires Microsoft Excel 2010 or later.


SP-093 : Urge to Merge? Maybe You Should Update Instead.
Ben Cochran, The Bedford Group, Inc.

Many SAS users need the functionality of the UPDATE statement, but they don't know about its built-in features, so instead they try to perform an update operation with the MERGE statement. The UPDATE statement incorporates powerful built-in logic that can make this operation a very simple programming endeavor. This paper explores some features of the UPDATE statement and why you would want to use it instead of the MERGE statement.


SP-110 : Creating In-line Style Macro Functions
Arthur Li, City of Hope

The macro functions we use in our programs are defined by the macro facility: given values for its parameters, a macro function generates a result that can be inserted directly into a macro statement. Programmers seldom realize that they can create user-defined macro functions as well. This paper focuses on methods of creating an in-line style macro function, illustrated through various examples.


e-Poster

PO-023 : Creating a True LSF Batch Job Submission Capability on SAS EG in a SAS Grid
Derek Grittmann, General Dynamics Federal Civilian Health
Adam Hendricks, General Dynamics Federal Civilian Health

See attachment


PO-028 : Generating SAS Datasets from ASCII Files Using a Crosswalk
Jose Centeno, NORC at the University of Chicago

In real-life applications, it is common to use a corresponding crosswalk to read raw files into SAS, especially when dealing with numerous output files and hundreds of variables with particular formats attached. In many cases, you will find yourself with the task of removing, adding, and/or updating variables, which can become challenging or tedious. This paper describes how, with the help of a few macros, we can reduce this effort and greatly decrease the number of lines of code in the main program. The resulting program will be easier to maintain, less error-prone, and easily deployed for other projects. This paper assumes a basic understanding of SAS DATA step programming and SAS macros.


PO-030 : Badge in Batch with Honeybadger: Generating Conference Badges with Quick Response (QR) Codes Containing Virtual Contact Cards (vCards) for Automatic Smart Phone Contact List Upload
Troy Hughes, Datmesis Analytics

Quick Response (QR) codes are widely used to encode information such as uniform resource locators (URLs) for websites, flight passenger data on airline tickets, attendee information on concert tickets, or product information on packaging. The proliferation of QR codes is due in part to the broad dissemination of smart phones and the accessibility of free QR code scanning applications. With the ease of self-scanning QR codes has come another common usage: the identification of conference attendees. Conference badges, emblazoned with an attendee-specific QR code, can communicate attendee contact and other personal information to other conference goers, including organizers, vendors, potential customers or employers, and other attendees. Unfortunately, some conference organizers choose not to include QR codes on badges because of the complexity and cost of producing them. To that end, this text introduces flexible Base SAS® software that overcomes this limitation by dynamically creating attendee QR codes from a data set containing contact and other information. Furthermore, the flexible, data-driven approach creates attendee badges that can be maintained and printed by conference organizers. When a badge QR code is scanned by a fellow conference goer, the attendee's personal information, including name, job title, company, phone number, email address, city, state, website, and biographical statement, is captured in a virtual contact card (vCard, a .vcf file) that can be uploaded automatically into a smart phone's contact list. Attendees can select what personal information is contained within their QR code, and conference organizers can customize and configure badge format and content through an external cascading style sheet (CSS) file that dynamically alters badges without modifying the underlying code. This end-to-end system offers conference organizers potential cost savings of hundreds of dollars!


PO-040 : Configuration and Usage of SAS®Py on Grid 9.4
Venkateswarlu Toluchuri, Tech Lead SAS Administrator

With the introduction of the official SASPy package, it is now trivial to incorporate SAS® into new workflows leveraging the Jupyter Notebook coding and publication environment, along with the broader Python data science ecosystem that comes with it. This paper and presentation provide an overview of installing and configuring SASPy with a SAS® Grid environment using the IOM method, and of setting up dedicated grid queues to balance workload properly. They also cover general principles of passing data between Python DataFrames and SAS data sets, as well as the unique advantages SAS brings to the notebook workspace and Python ecosystem. A number of new possibilities have emerged for integrating SAS® into tools widely used in the Python corner of data science. Yet given the number of potentially overlapping components involved (Jupyter, SASPy, SWAT, Pipefitter), there is potential for confusion regarding the practical installation and setup of SASPy with a SAS® 9.4 Grid and of dedicated queues for Python sessions. This paper focuses on installing and configuring SASPy on a client machine and integrating it with a SAS® 9.4 Grid environment, the pieces likely to have the broadest immediate audience and benefit: primarily Jupyter notebooks and SASPy, which together offer an excellent starting point toward the many benefits Python integration can bring to SAS® workflows and analytics projects.


PO-045 : Have Your SAS Program and Schedule It Too!
Mario Tejada, NORC at the University of Chicago

In some SAS environments, it is common to use the Windows Task Scheduler to launch production jobs on a set schedule. To create a basic task, one needs to manually open Task Scheduler and enter the desired parameters for the job. This paper explores using SAS to drive the scheduling process: instead of going into Task Scheduler, the programmer can use SAS to set the parameters of a job and programmatically register that task in Windows using PowerShell. This paper assumes a basic understanding of SAS DATA step programming and SAS macros, as well as administrator-level access to run Windows PowerShell commands and create scheduled tasks.


PO-070 : Levels Do Count - A New Dimension To The Interaction Effect In 3-way Factorial Analysis
Varsha Ganagalla, Grand Valley State University
Daniel Adrian, Grand Valley State University

The standard output from the analysis of factorial experiments with three fixed factors A, B, and C includes the sums of squares and associated tests for the factorial effects: the main effects A, B, and C; the two-factor interactions AB, AC, and BC; and the three-factor interaction ABC. However, if ABC is significant, analysis often proceeds by testing the two-factor interactions at each level of the third factor, which is not part of the standard output. Most of the heavy lifting can be accomplished with two-way ANOVAs and BY statements, but the resulting tests need to be adjusted to incorporate the mean squared error and error degrees of freedom from the entire dataset. This can be extremely cumbersome if attempted manually. We present SAS macro code that automates this analysis for any three-factor dataset, reducing both time and programming errors.


PO-071 : Frequency matching case-control techniques: an epidemiological perspective
Hai Nguyen, UIC-School of Public Health

(Please refer to the abstract in the Word file in the Submission File, which includes a hierarchy diagram and a table of results.)


PO-082 : Oh, There's No Place Like SAS ODS Graphics for the Holidays!
Ted Conway, Self

Already a SAS ODS Graphics user at work, the author used the (free!) SAS University Edition software he'd recently downloaded and installed on his home laptop to knock out a connect-the-dots Tom Turkey with PROC SGPLOT to commemorate Thanksgiving 2015. And so began an ongoing series of "Fun with SAS ODS Graphics" posts on the SAS Support Communities and Twitter that celebrated major holidays and other events. While creating these admittedly frivolous charts from the comfort of his easy chair, the author learned some useful techniques for creating serious data vizzes, which will be shared in this e-Poster and the accompanying paper.


PO-097 : Exploring Wine Reviews: How Language and Word Use Varies in Wine Reviews
Abigail Zysk, Grand Valley State University
Kylie Springer, Grand Valley State University

With thousands of varieties of wine, wine descriptions are diverse and unique to the individual describing the wine. A dataset including the wine variety, reviewer, and wine descriptor/flavor words was used to explore the frequency of word use within certain varieties of wine, for individual reviewers, and for the combination of variety and reviewer. By examining the word usage of the top reviewers across different varieties of wine, we saw that the most used wine descriptor words were not exclusive to varieties of wine but dependent on the wine reviewer. Roger Voss, who had the largest number of reviews, used the word 'rich' 13.37% of the time when using a wine descriptor word. This was reflected across different varieties of wines: when reviewing Bordeaux-Style Red Blends he used the word 'rich' 13.57% of the time when using any wine descriptor words, 13.41% for Chardonnay, and 16.09% for Malbec. When looking at word usage for a single variety of wine, we concluded that the words favored by reviewers could influence the results. When exploring the most used wine descriptor words for Bordeaux-Style Red Blends and Rosé wine, the word 'rich' was one of the top words for both, which could be due to Voss frequently using the word. Based on these results, when selecting a bottle of wine it would be helpful to look at reviews from a multitude of different wine reviewers to get an accurate description of the wine.
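The descriptor-share calculation behind figures like "13.37% of the time" is a straightforward frequency count. A small Python sketch (the records and numbers here are invented, not the paper's data):

```python
from collections import Counter

# Hypothetical (reviewer, variety, descriptor) records extracted from reviews
records = [
    ("Roger Voss", "Malbec", "rich"),
    ("Roger Voss", "Malbec", "fruity"),
    ("Roger Voss", "Chardonnay", "rich"),
    ("Roger Voss", "Chardonnay", "crisp"),
    ("Other Reviewer", "Malbec", "bold"),
]

def descriptor_share(records, reviewer, word):
    """Share of a reviewer's descriptor-word uses accounted for by one word."""
    counts = Counter(w for r, _, w in records if r == reviewer)
    total = sum(counts.values())
    return counts[word] / total if total else 0.0

print(descriptor_share(records, "Roger Voss", "rich"))  # 2 of 4 uses -> 0.5
```

Restricting the generator to a single variety gives the per-variety shares reported in the abstract.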


PO-098 : Profiling hospital length of stay using the mode
Anne Cain-Nielsen, University of Michigan
Scott Regenbogen, University of Michigan

Measures of hospital length of stay (LOS) are often used to compare hospital performance and are frequently used in health services research applications. While often appropriate, common metrics of central tendency used for length of stay profiling, such as the mean or median, can be sensitive to outlying values or may not identify representative patterns of care. In an analysis using national Medicare data, we considered three possible profiling measures: hospital mean, median, and mode LOS. We wished to profile the intended or 'typical' postoperative LOS for beneficiaries who underwent total hip replacement (THR, 231,774 patients in 1,831 hospitals), coronary artery bypass grafting (CABG, 218,940 patients in 1,056 hospitals), or colectomy (189,229 patients in 1,876 hospitals). For all three procedures, mean LOS was the metric most sensitive to outlying values (e.g. longer lengths of stay associated with post-operative complications). For CABG and colectomy, median LOS was also longer than the most typical (mode) postoperative care pathway. We will illustrate how hospital mode length of stay can be easily calculated using SAS 9.4, and demonstrate that the mode can be an appropriate metric for profiling hospital length of stay for certain analytic objectives. This presentation would be relevant for any level of SAS user whose work involves profiling hospital length of stay (e.g. health services research, hospital quality improvement, medicine).
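The contrast among the three profiling metrics is easy to reproduce. A Python sketch with invented lengths of stay (the paper's analysis used SAS 9.4 and national Medicare data):

```python
from statistics import mean, median, mode

# Hypothetical post-op lengths of stay (days) at one hospital: most patients
# follow a 4-day pathway, with a few long stays from complications
los = [4, 4, 4, 4, 4, 5, 5, 6, 21, 35]

print(mean(los))    # pulled upward by the two outliers
print(median(los))  # less sensitive, but still above the typical stay
print(mode(los))    # the most common (typical) care pathway
```

In SAS, the equivalent per-hospital mode could be obtained with PROC FREQ or PROC UNIVARIATE by hospital.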


PO-099 : Seeing the Things We Love with SAS
Laurie Smith, Cincinnati Children's Hospital Medical Center

It's fun to use SAS (even Base SAS) to see the things we love in a different light. As a huge hip hop fan, I was interested in using SAS to view the music I listen to differently, by creating a visual (graphical) comparison of an artist's lyrical content versus how well each track did on the Billboard Hot 100 chart. The approach: obtain a list of the singles by rank, use an API to import the lyrics from Musixmatch with SAS, and then classify the lyrics into different categories. I don't know yet whether there is a correlation, but it's interesting to research. This is definitely not a typical use for SAS, but it's a fun way to step out!