Proceedings


MWSUG 2018 Paper Presentations

Paper presentations are the heart of a SAS users group meeting. MWSUG 2018 will feature dozens of paper presentations organized into several academic sections covering a variety of topics and experience levels.

Note: Content and schedule are subject to change. Last updated 12-Jul-2018.



Business Leadership

Paper No. Author(s) Paper Title (click for abstract)
BL-003 Kirk Paul Lafler Differentiate Yourself
BL-028 Brandy Sinco Computer Karma: Non-Monetary Benefits from Statistical And Information Technology Volunteer Work
BL-043 Chuck Kincaid How to Succeed in Consulting
BL-050 Nancy Brucken Customized Project Tracking with SAS and Jira
BL-064 Josh Horstman So You Want To Be An Independent Consultant
BL-090 Sridevi Loya Analyzing YouTube comments on gun violence using SAS® Viya and SAS® Text Miner
BL-101 Paul Segal Analytics in the Cloud: Beware of "hidden" costs
BL-103 Paul Segal Accelerate Your Analytics with SAS® and Teradata Using Disparate Data Sources
BL-104 Peter Eberhardt What is Leadership?
BL-112 Ying Shi Modified Response Evaluation Criteria in Solid Tumors for Immuno-Oncology Clinical Trials
BL-122 Chuck Kincaid How to HOW: Hands-on-Workshops Made Easy
BL-127 Troy Hughes From Readability to Responsible Risk Management: Facilitating the Automatic Identification and Aggregation of Software Technical Debt within an Organization Through Standardized Commenting in SAS® Program Files and SAS Enterprise Guide Project Files
BL-144 Tho Nguyen Become a Data-Driven Organization with People, Process and Technology
BL-146 Amy Peters Comparing SAS® Viya® and SAS® 9.4 Capabilities: A Tale of Two SAS Platform Engines
BL-147 David Corliss Data For Good as a Community Service Project at Work


Hands-on Workshops

Paper No. Author(s) Paper Title (click for abstract)
HW-009 Kirk Paul Lafler A Hands-on Introduction to SAS® Metadata DICTIONARY Tables and SASHELP Views
HW-033 Richann Watson & Kriss Harris Animate Your Data!
HW-055 Jayanth Iyengar Understanding Administrative Healthcare Data sets using SAS programming tools
HW-065 Josh Horstman Getting Started with the SGPLOT Procedure
HW-098 Kent Phelps & Ronda Phelps The Joinless Join ~ The Impossible Dream Come True Using SAS® Enterprise Guide® and Base SAS® PROC SQL and DATA Step; Expand the Power of SAS® Enterprise Guide® and Base SAS® in New Ways
HW-099 Peter Eberhardt The Baker Street Irregulars Investigate: Discoveries Using Perl Regular Expressions and SAS®


Health Sciences

Paper No. Author(s) Paper Title (click for abstract)
HS-027 Brandy Sinco et al. Tips, Tricks, and Traps on Longitudinal Data Analysis with Discrete and Continuous Times
HS-036 Michael Battaglia et al. How to Navigate in a Maze of the Raking Macro with Advanced Weight Trimming
HS-038 Michael G. Wilson Assessing Model Adequacy in Proportional Hazards Regression
HS-039 Xiaoting Wu et al. Using SAS® to Validate Clinical Prediction Models
HS-045 Xi Chen et al. Pan-Cancer Epigenetic Biomarker Selection from Blood Sample Using SAS®
HS-053 Jamie Kammer et al. Creating suicide attempt/intentional self-harm episodes using administrative billing data
HS-071 Roderick Jones & Lynn (Xiaohong) Liu Leveraging SHEWHART Procedure Options to Monitor and Evaluate Improvements in Healthcare
HS-073 Laurie Smith A Macro to Import Subject Data Saved in a Location with Separate Subfolders for each Subject
HS-074 Lynn (Xiaohong) Liu & Roderick Jones Use of SAS Macros to automate the Production of Statistical Process Control Charts
HS-079 Mohsen Asghari et al. SAS Text-mining tools applied to Medical Information Assessment: ICD-10 code retrieval
HS-080 Sunil Kumar A Data Mining Approach to Predict Dental Adverse Events
HS-083 Brian Mosier et al. A Macro to Calculate Sample Size for Studies Using the Proportional Time Assumption
HS-088 Jennifer Scodes Baseline Mean Centering for Analysis of Covariance (ANCOVA) Method of Randomized Controlled Trial Data Analysis
HS-094 David Corliss Genocide Modeling - Historical Risk Factors and Odds Ratios
HS-118 Adams Kusi Appiah Bootstrap Linear Mixed-Effects Models using SAS Procedures
HS-119 Xuelin Li et al. Automated Transfer of a Sea of SAS® Programs between Data Transfers
HS-123 Pratap Kunwar A Macro to Add SDTM Supplemental Domain to Standard Domain
HS-132 Troy Hughes Toward Adoption of Agile Software Development in Clinical Trials
HS-136 Rishabh Mishra Addressing Opioid Crisis using Data Science
HS-142 Zeqing Lu et al. Spotfire Clinical visualizations from SAS and R
HS-143 Hillary Graham et al. AutoPDF : an R Package to Output Vector Graphics


SAS 101 Plus

Paper No. Author(s) Paper Title (click for abstract)
SP-002 Kirk Paul Lafler Making Your SAS® Output, Results, Reports, Charts and Spreadsheets More Meaningful with Color
SP-048 John Schmitz Using Multilabel Formats with PROC SUMMARY to Generate Report Data with Overlapping Time Segments
SP-049 Jack Shoemaker Data-driven Data Analysis
SP-057 Ting Sa A Macro that Can Get the Geo Coding Information from the Google Map API
SP-061 Donna Levy & Nancy Brucken Seeing the Forest for the Trees: Part Deux of Defensive Coding by Example
SP-062 Veronica Renauldo Efficiency Programming with Macro Variable Arrays
SP-063 Josh Horstman Dating for SAS Programmers
SP-066 Josh Horstman Merge with Caution: How to Avoid Common Problems when Combining SAS Datasets
SP-069 Larry Riggen What's the Difference? Using the PROC COMPARE to find out.
SP-075 Margaret Kline & Daniel Muzyka From Clicking to Coding: Using ODS Graphics Designer as a Tool to Learn Graph Template Language
SP-076 Jayanth Iyengar Tips, Traps, and Techniques in BASE SAS for vertically combining SAS data sets
SP-078 Jacob Keeley & Carl Nord Improving Plots Using XAXISTABLE and YAXISTABLE
SP-084 Lindsey Whiting & Joey Kaiser Automating SAS Job Streams With the Power of VB Script
SP-100 Peter Eberhardt Using SASv9.cfg, autoexec.sas, SAS® Registry, and Options to Set Up Base SAS®
SP-106 Kaiqing Fan SAS Techniques to Handle Big Files And Reduce Execution times
SP-116 Louise Hadden Order, Order! Four Ways to Reorder Your Variables, Ranked by Elegance and Efficiency
SP-139 Warren Kuhfeld Keeping Up to Date with ODS Graphics


SAS 301 Beyond the Basics

Paper No. Author(s) Paper Title (click for abstract)
SB-010 Kirk Paul Lafler & Stephen Sloan A Quick Look at Fuzzy Matching Programming Techniques Using SAS® Software
SB-019 Kirk Paul Lafler Visual Storytelling - The Art of Communicating Information with Graphics
SB-021 Kaiqing Fan How to Assembly Line Create Graphic Images Using PROC TEMPLATE in SAS Enterprise Guide? Part I
SB-034 Barbara Okerson Backsplash patterns for your world: A look at SAS OpenStreetMap (OSM) tile servers
SB-040 Yurong Dai & Jiangang Jameson Cai Conversion of CDISC specifications to CDISC data - specifications driven SAS programming for CDISC data mapping
SB-052 John Schmitz Show Me That? Using SAS VIYA, Visual Analytics and Free ESRI Maps to Show Geographic Data
SB-059 Mark Keintz Finding National Best Bid and Best Offer - Quote by Quote
SB-060 Mark Keintz From Stocks to Flows: Using SAS® HASH objects for FIFO, LIFO, and other FO's
SB-081 Scott Koval Picture Perfect: An Introduction to the Image Action Set available with SAS® Viya® Programming
SB-089 Manideep Mellachervu & Anvesh Reddy Minukuri Analyzing Amazon's Customer Reviews using SAS® Text Miner for Devising Successful Product Launch Strategies
SB-093 Deanna Schreiber-Gregory & Karlen Bader Quality Control for Big Data: How to Utilize High Performance Binning Techniques
SB-102 Paul Segal Speed up your Data Processing with SAS Code Accelerator.
SB-114 Louise Hadden Wow! You Did That Map With SAS®?! Round II
SB-140 Jane Eslinger Square Peg, Square Hole-Getting Tables to Fit on Slides in the ODS Destination for PowerPoint
SB-141 Warren Kuhfeld Advanced ODS Graphics Examples
SB-145 Kaushal Chaudhary & Dhruba Ghimire Perl Regular Expression - The Power to Know the PERL in Your Data


SAS Super Demos

Paper No. Author(s) Paper Title (click for abstract)
SD-149 Danny Modlin Creating a Custom Task in SAS Studio
SD-150 Brett Wujek Executing Open Source Code in Machine Learning Pipelines of SAS Visual Data Mining and Machine Learning
SD-151 Brett Wujek Tune In to Model Tuning
SD-152 Warren Kuhfeld Highly Customized Graphs Using ODS Graphics
SD-153 Warren Kuhfeld Heat Maps: Graphically Displaying Big Data and Small Tables
SD-154 Jane Eslinger What's New in the ODS Excel Destination
SD-155 Jane Eslinger Creating Pivot tables using ODS Markup
SD-156 Cynthia Zender SAS 9.4 ODS in a Nutshell
SD-157 Cynthia Zender Accessibility with ODS Output
SD-158 Amy Peters The Future of SAS Enterprise Guide and SAS Studio


Statistics / Advanced Analytics

Paper No. Author(s) Paper Title (click for abstract)
AA-029 Matthew Bates Automatic Indicators for Dummies: A macro for generating dummy indicators from category type variables
AA-030 Michael Grierson Confounded? This example shows how to use SAS chi-square tests, correlations and logistic regression to unconfound a result.
AA-031 Bruce Lund Screening, Transforming, and Fitting Predictors for Cumulative Logit Model
AA-035 Ming-Long Lam Monitoring the Relevance of Predictors for a Model Over Time
AA-041 Peter Flom Alternative methods of regression when OLS is not right.
AA-042 Peter Flom An introduction to classification and regression trees with PROC HPSPLIT.
AA-047 Bruce Lund Propensity Scores and Causal Inference for (and by) a Beginner
AA-077 Kwideok Han Estimating the Impacts of the EDA Public Works Program on County Employments Using SAS/ETS 14.1.
AA-091 Deanna Schreiber-Gregory & Karlen Bader Logistic and Linear Regression Assumptions: Violation Recognition and Control
AA-092 Deanna Schreiber-Gregory & Karlen Bader Regularization Techniques for Multicollinearity: Lasso, Ridge, and Elastic Nets
AA-108 Pat Berglund Using SAS® for Multiple Imputation and Analysis of Longitudinal Data
AA-109 Palash Sharma Application of heavy-tailed distributions using PROC IML, NLMIXED and SEVERITY
AA-117 Yuting Tian An Introduction to the process of improving a neural network
AA-120 Min Chen Handling Missing Data in Exploratory Factor Analysis Using SAS
AA-121 Scott Koval How to Score Big with SAS Solutions: Various Ways to Score New Data with Trained Models
AA-137 Danny Modlin Getting Started with Bayesian Analytics
AA-138 Brett Wujek Introduction to Machine Learning in SAS


e-Posters

Paper No. Author(s) Paper Title (click for abstract)
PO-032 Richann Watson & Kriss Harris Great Time to Learn GTL
PO-044 Venkateswarlu Toluchuri Self-service utility to List and Terminate SAS grid jobs
PO-051 Nancy Brucken & Jared Slain An Update on the CS Standard Analyses and Code Sharing Working Group
PO-054 Robert Downer Using your FREQ effectively: Displays to Decipher Proportional Odds in Ordinal Regression
PO-068 Parag Vilas Sasturkar Factors Responsible for Students' Enrollment at Oklahoma State University
PO-107 Kaiqing Fan An Easy Way to Know When to Buy and When to Sell your Stocks
PO-111 Guangtao Gao How to Avoid Possible Tricks When Using DATA STEP MERGE Instead of PROC SQL JOIN
PO-115 Louise Hadden Purrfectly Fabulous Feline Functions




Abstracts

Business Leadership

BL-003 : Differentiate Yourself
Kirk Paul Lafler, Software Intelligence Corporation

Today's job and employment marketplace is highly competitive. As a result, SAS® professionals should do everything they can to differentiate and prepare themselves for the global marketplace by acquiring and enhancing their technical and soft skills. Topics include how SAS professionals can assess and enhance their existing skills using an assortment of valuable, and "free", SAS-related content; become involved, volunteer, and speak at in-house, local, regional, and international SAS user group meetings and conferences; and publish blog posts, videos, articles, and PDF "white" papers to differentiate themselves from the competition.


BL-028 : Computer Karma: Non-Monetary Benefits from Statistical And Information Technology Volunteer Work
Brandy Sinco, University of Michigan

Volunteer statistical and information technology work offers many rewards other than direct monetary payment. Many of these rewards are at least as valuable as money. First, volunteers can enhance their skills, which will be useful in a present or future job. Examples with SAS range from simple techniques, such as computing intraclass correlation coefficients with Proc Mixed, to complex analysis methods, such as learning how to bootstrap indirect effects with Proc CALIS. Other non-SAS examples are learning how to debug error messages about audio links in web pages and expanding knowledge of computer troubleshooting. To be successful as a data scientist, analysts must be continually willing to expand their skills through continuing education. Volunteer work enhances and supplements formal classes and workshops by providing learning opportunities without the rigid deadlines of a paid job. Second, volunteers can network with people who may later assist in finding a job. Real-life examples will be provided of people who found data analysis and information technology jobs by being involved in volunteer projects. Third, volunteers may connect with people who can assist with over-employment by serving as referrals for unwanted overtime projects. Again, real-life stories will be shared.


BL-043 : How to Succeed in Consulting
Chuck Kincaid, Experis Business Analytics

Maybe you are an awesome programmer working on a company's internal consulting team, but you have a hard time getting work done by the deadline. Maybe you are a strong SAS developer who does independent consulting, but clients get upset when changes they ask for cost them more money. Just because you're good at the technical skills doesn't mean you can succeed as a consultant. This presentation will give you tips on how to do just that. With a combination of project management and consulting skills you can go much farther, whether you're doing internal consulting, independent consulting, or working for a consulting company. This presentation will be good for people who want to do better at managing the world outside of their code.


BL-050 : Customized Project Tracking with SAS and Jira
Nancy Brucken, InVentiv Health Clinical

As programmers and statisticians, most of us are far better at programming tasks than we are at project management. Jira is a powerful and inexpensive commercially-available application designed to help programming teams track their progress on projects. It is built on top of a PostgreSQL database, which can be queried from SAS using SAS/ACCESS to ODBC to generate a variety of custom reports.
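The querying approach described above can be sketched in a few lines, assuming a SAS/ACCESS to ODBC license and an ODBC data source configured against the Jira database; the DSN name, table, and column names below are illustrative placeholders, not Jira's actual schema:

```sas
/* Hypothetical sketch: read Jira's PostgreSQL back end through an ODBC DSN.
   "jira_pg", PROJECT_ISSUES, and the column names are illustrative only. */
libname jira odbc datasrc="jira_pg" user=&dbuser password=&dbpass;

proc sql;
   create table open_issues as
   select issue_key, summary, assignee, status
   from jira.project_issues
   where status ne 'Closed';
quit;
```

Once the libref is assigned, the Jira tables behave like any other SAS data sets, so PROC REPORT or PROC SGPLOT can be pointed at them directly for custom status reports.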


BL-064 : So You Want To Be An Independent Consultant
Josh Horstman, Nested Loop Consulting

While many statisticians and programmers are content with a traditional employment setting, others yearn for the freedom and flexibility that come with being an independent consultant. While consulting can be a tremendous benefit, there are many details to consider. This paper will provide an overview of consulting as a statistician or programmer. We'll discuss the advantages and disadvantages of consulting, getting started, finding work, operating your business, and various legal, financial, and logistical issues.
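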


BL-090 : Analyzing YouTube comments on gun violence using SAS® Viya and SAS® Text Miner
Sridevi Loya, Student

Gun violence is a major cause of premature death in the U.S. "Guns kill more than 38,000 people and cause nearly 85,000 injuries each year," according to the American Public Health Association. Gun violence is a complex issue that is spread throughout the country, which is why it is important to understand the demographics where this violence occurs and how people respond to such incidents. This paper focuses on opinions about gun violence and reform in United States schools, including the views people expressed after the Florida mass shooting. On February 14, 2018, a mass shooting occurred at Marjory Stoneman Douglas High School in Parkland, Florida; seventeen people were killed and seventeen more were wounded. Data on similar incidents from the last few years is pulled from Kaggle.com. Using this data, descriptive analytics will be carried out to understand various metrics and create a demographic impression using SAS Enterprise Guide and SAS Viya. Text analytics will be performed with SAS Text Miner and SAS Enterprise Miner on comments people wrote on YouTube after the incident occurred, looking especially at two specific sources of textual data: the CNN news and ABC news sites. This paper should benefit a wide audience, as its main aim is to examine people's opinions and derive meaningful insights and viable recommendations to mitigate such acts.


BL-101 : Analytics in the Cloud: Beware of "hidden" costs
Paul Segal, Teradata

With the rush to the cloud, it is easy for the data scientist to be unaware of some of the "hidden" costs of using cloud-based infrastructure. These hidden costs can cause a large increase in the operational expenditure that gets billed to your department. In this talk we show you what those costs are, how easy it is to incur them, and how to avoid (or at least mitigate) them using SAS in-database processing for Teradata. This talk will also include a live demonstration of the techniques outlined.


BL-103 : Accelerate Your Analytics with SAS® and Teradata Using Disparate Data Sources
Paul Segal, Teradata

Analytics today often involves working with multiple data types from multiple storage types: traditional relational database management systems (RDBMSs) such as Teradata, Oracle, DB2, Microsoft SQL Server, and MySQL; file-system-type storage such as Apache Hadoop and Amazon Simple Storage Service (Amazon S3); and NoSQL sources such as MongoDB and Cassandra. Sourcing the data from a federated data layer brings its own share of issues, such as having to know all the details for every data platform (IP address, port numbers, logon details, data access mechanism, data query languages, and so on). Other drawbacks of a federated data space are that the data often needs to be replicated and stored (using up valuable disc space), and you might no longer be able to leverage processes that speed up your analytics (such as in-database or on-platform processing). In this presentation, we present a solution that addresses all these issues. Teradata QueryGrid combines the most comprehensive in-database solution from SAS with the Teradata RDBMS. With Teradata QueryGrid, you can access data from a wide variety of data sources using a common language (SQL), abstracting away the connection details so that you don't need to know them, all while using the tremendous performance of SAS® running inside the Teradata database.


BL-104 : What is Leadership?
Peter Eberhardt, Fernwood Consulting Group Inc

In this presentation we will talk about the nature of leadership; that is, what it takes to be a leader. We will also look at the difference between being a manager and being a leader. The discussion is industry agnostic. It is applicable to all levels of SAS users, but those just starting out on their careers will find it thought provoking.


BL-112 : Modified Response Evaluation Criteria in Solid Tumors for Immuno-Oncology Clinical Trials
Ying Shi, Eli Lilly and Company

Immunotherapeutic agents may produce antitumor effects by potentiating endogenous cancer-specific immune responses. The response patterns may extend beyond the typical time course and can manifest a clinical response after an initial cancer progression. Standard RECIST may not provide an accurate response assessment for immuno-oncology trials, but there is no established analysis standard across the pharmaceutical industry. This paper explains one way to modify RECIST 1.1 and the related analysis needs, together with derivations of best overall response using SAS. There is no dependency on operating system or SAS version. Knowledge of oncology clinical trials is assumed for the audience.


BL-122 : How to HOW: Hands-on-Workshops Made Easy
Chuck Kincaid, Experis Business Analytics

Have you ever attended a Hands-on-Workshop and found it useful? Many people do! Being able to actually try out the things that you're learning is a wonderful way to learn. It's also a great way to teach: you can see whether people can apply what they're learning. Have you ever thought that it would be fun to teach other people in a hands-on format? Maybe you weren't sure what it takes or how to approach the course. This presentation will help you with those questions and struggles. What to teach? How much to teach? How should I teach it? How is a Hands-on-Workshop different from lecture style? How much to put into PowerPoint slides? What if they ask me something I don't know? What if they have a computer problem? All those questions that you have will be answered in this presentation.


BL-127 : From Readability to Responsible Risk Management: Facilitating the Automatic Identification and Aggregation of Software Technical Debt within an Organization Through Standardized Commenting in SAS® Program Files and SAS Enterprise Guide Project Files
Troy Hughes, Datmesis Analytics

Software readability is greatly improved when programs include descriptive comments in a predictable, standardized format. Program headers that describe software requirements, author, creation date, versioning history, caveats, and other metadata are a common method to facilitate a greater understanding of software objectives, strengths, weaknesses, and prerequisites. Moreover, when program headers are standardized, they are not only more readable to developers but also to parsing algorithms that automatically extract metadata for analysis or archival. In addition to those included in program headers, comments throughout software can be parsed and extracted when constructed in a standardized format. This text introduces a standardized commenting methodology that enables both qualitative and quantitative comments to be parsed from SAS® software headers and body. A configuration file defines comment formatting and content and provides a flexible, scalable, reusable SAS macro-based solution. This text demonstrates one use case of this methodology in which software technical debt and risk are assessed via both qualitative (e.g., risk description, proposed risk resolution) and quantitative (e.g., risk severity, risk probability, likelihood of risk discovery, ease of risk mitigation) metadata and metrics included within SAS comments. The comment interpreter dynamically identifies and parses all SAS program files and SAS Enterprise Guide project files (including embedded SAS programs therein) within one or more folders to produce a comprehensive risk register for unlimited programs. This data-driven documentation, generated with push-button simplicity, enables SAS practitioners to better understand and make decisions about project and program risk and technical debt.


BL-144 : Become a Data-Driven Organization with People, Process and Technology
Tho Nguyen, Teradata

Data is a differentiator and an asset for making decisions. As an industry, we are data rich but knowledge poor because organizations are unable to make sense of all the data they collect. We are barely scratching the surface when it comes to analyzing all of the data that we have. In addition, analyzing the data has become much more complex and time-consuming, and companies may not have the right people, process, or technology to do the job effectively and efficiently. As data volumes continue to grow, it is imperative to have the proper people, process, and technology to become a data-driven organization.


BL-146 : Comparing SAS® Viya® and SAS® 9.4 Capabilities: A Tale of Two SAS Platform Engines
Amy Peters, SAS

SAS® Viya® extends the SAS® Platform in a number of ways and has opened the door for new SAS® software to take advantage of its capabilities. SAS® 9.4 continues to be a foundational component of the SAS Platform, not only providing the backbone for a product suite that has matured over the last forty years, but also delivering direct interoperability with the next generation analytics engine of SAS Viya. Learn about the core capabilities shared between SAS Viya and SAS 9.4, and about where they are unique. See how the capabilities complement each other in a common environment, and understand when it makes sense to choose between the two and when it makes sense to go with both. In addition to these core capabilities, see how the various SAS software product lines stack up in both, including analytics, visualization, and data management. Some products, like SAS® Visual Analytics, have one version aligned with SAS Viya and a different version with SAS 9.4. Other products, like SAS® Econometrics, leverage the in-memory, distributed processing of SAS Viya, while at the same time including SAS 9.4 functionality like Base SAS® and SAS/ETS® software. Still other products target one engine or the other. Learn which products are available on each, and see functional comparisons between the two. In general, gain a better understanding of the similarities and differences between these two engines behind the SAS Platform, and the ways in which products leverage them.


BL-147 : Data For Good as a Community Service Project at Work
David Corliss, Peace-Work

Many businesses, large and small, support volunteering within the community, often sponsoring volunteer work days where employees can attend a well-organized activity for a group doing good work in the community. Today, Data For Good volunteering offers an opportunity to supply a critical skill to charitable organizations making a difference. This presentation is designed to help business leaders set up a Data For Good project as a community service and team-building activity, with best practices for finding a good organization to support, recruiting participants, managing the volunteer work day, and sharing the story with the wider community. Descriptions of successful projects and how to do them include building a membership database, data dive events to develop recruiting models for community service groups, seasonal optimization of resources (e.g., food pantries), and others. Practical, proven processes are presented to make Data For Good your company's next community service project.


Hands-on Workshops

HW-009 : A Hands-on Introduction to SAS® Metadata DICTIONARY Tables and SASHELP Views
Kirk Paul Lafler, Software Intelligence Corporation

SAS® users can easily and quickly access metadata content with a number of read-only SAS data sets called DICTIONARY tables or their counterparts, SASHELP views. During a SAS session, information (known as metadata) is captured including SAS system options along with their default values, assigned librefs, table names, column names and attributes, formats, indexes, and more. This hands-on workshop introduces how metadata can be used as input into a SAS code generator or a SAS macro to produce the desired results, the application of specific DICTIONARY table and SASHELP view content, and an assortment of examples related to the creation of dynamic code.
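For example, both interfaces described above can list the tables in a library; DICTIONARY.TABLES is queried through PROC SQL, while the equivalent SASHELP.VTABLE view can feed a DATA step:

```sas
/* List the tables in the WORK library and their observation counts,
   first via the DICTIONARY table, then via the SASHELP view. */
proc sql;
   select memname, nobs
   from dictionary.tables
   where libname = 'WORK';
quit;

data work_tables;
   set sashelp.vtable;
   where libname = 'WORK';
   keep memname nobs;
run;
```

The same pattern generalizes to DICTIONARY.COLUMNS, DICTIONARY.OPTIONS, and the other metadata tables the workshop covers.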


HW-033 : Animate Your Data!
Richann Watson, DataRich Consulting
Kriss Harris, SAS Specialists Ltd

When reporting your safety data, do you ever feel sorry for the person who has to read all the laboratory listings and summaries? Or have you ever wondered if there is a better way to visualize safety data? Let's use animation to help the reviewer and to reveal patterns in your safety data, or in any data! This hands-on workshop demonstrates how you can use animation in SAS® 9.4 to report your safety data, using techniques such as visualizing a patient's laboratory results, vital sign results, and electrocardiogram results and seeing how those safety results change over time. In addition, you will learn how to animate adverse events over time, and how to show the relationships between adverse events and laboratory results using animation. You will also learn how to use the EXPAND procedure to ensure that your animations are smooth. Animating your data will bring your data to life and help improve lives!
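The PROC EXPAND smoothing mentioned above can be sketched roughly as follows (PROC EXPAND requires SAS/ETS; the data set and variable names here are hypothetical, not the authors' actual code). Interpolating sparse monthly lab values onto a daily grid gives the animation more frames to step through:

```sas
/* Hypothetical sketch: spline-interpolate monthly lab results to daily
   values so animation frames transition smoothly between visits. */
proc expand data=labs out=labs_smooth from=month to=day;
   by subjid;                       /* one series per subject        */
   id visit_date;                   /* SAS date identifying each row */
   convert lbstresn / method=spline;
run;
```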


HW-055 : Understanding Administrative Healthcare Data sets using SAS programming tools
Jayanth Iyengar, Data Systems Consultants LLC

Changes in the healthcare industry have highlighted the importance of healthcare data. The volume of healthcare data collected by healthcare institutions, such as providers and insurance companies, is massive and growing exponentially. SAS programmers need to understand the nuances and complexities of healthcare data structures to perform their responsibilities. There are various types and sources of administrative healthcare data, including healthcare claims (Medicare, commercial insurance, and pharmacy), hospital inpatient, and hospital outpatient data. This training seminar will give attendees an overview and detailed explanation of the different types of healthcare data, and the SAS programming constructs to work with them. The workshop will engage attendees with a series of SAS exercises involving healthcare datasets.


HW-065 : Getting Started with the SGPLOT Procedure
Josh Horstman, Nested Loop Consulting

Do you want to create highly-customizable, publication-ready graphics in just minutes using SAS? This workshop will introduce the SGPLOT procedure, which is part of the ODS Statistical Graphics package included in Base SAS. Starting with the basic building blocks, you'll be constructing basic plots and charts in no time. We'll work through several different plot types and learn some simple ways to customize each one.
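Getting started takes very little code; for instance, two of the basic plot types using the SASHELP.CLASS sample data set that ships with SAS:

```sas
/* Grouped scatter plot: height vs. weight, colored by sex. */
proc sgplot data=sashelp.class;
   scatter x=height y=weight / group=sex;
run;

/* Vertical bar chart of counts by age. */
proc sgplot data=sashelp.class;
   vbar age;
run;
```

Additional statements (TITLE, XAXIS/YAXIS, KEYLEGEND) layer customization onto the same basic skeleton.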


HW-098 : The Joinless Join ~ The Impossible Dream Come True Using SAS® Enterprise Guide® and Base SAS® PROC SQL and DATA Step; Expand the Power of SAS® Enterprise Guide® and Base SAS® in New Ways
Kent Phelps, Illuminator Coaching, Inc.
Ronda Phelps, Illuminator Coaching, Inc.

SAS Enterprise Guide and Base SAS can easily combine data from tables or data sets by using a PROC SQL Join to match on like columns or by using a DATA Step Merge to match on the same variable name. However, what do you do when tables or data sets do not contain like columns or the same variable name and a Join or Merge cannot be used? We invite you to attend our exciting Joinless Join Hands-On Workshop where we will empower you to expand the power of SAS Enterprise Guide and Base SAS in new ways by creatively overcoming the limits of a standard Join or Merge. You will learn how to design a Joinless Join based upon dependencies, indirect relationships, or no relationships at all between the tables or data sets using SAS Enterprise Guide and Base SAS PROC SQL and DATA Step. In addition, we will highlight how to use a Joinless Join to prepare unrelated joinless data to be utilized by ODS and PROC REPORT in creating a PDF. Come experience the power and versatility of the Joinless Join to greatly expand your data transformation and analysis toolkit.


HW-099 : The Baker Street Irregulars Investigate: Discoveries Using Perl Regular Expressions and SAS®
Peter Eberhardt, Fernwood Consulting Group Inc

A true detective needs the help of a small army of assistants to track down and apprehend the bad guys. Likewise, a good SAS® programmer will use a small army of functions to find and fix bad data. In this paper we will show how the small army of regular expressions in SAS can help you.
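A small illustration of the idea (the input data set and variable names are hypothetical): PRXMATCH returns the position of the first match, or 0 when the pattern is not found, which makes it a natural data-validation check:

```sas
/* Flag records whose phone number does not match a simple
   NNN-NNN-NNNN pattern; prxmatch returns 0 on no match. */
data checked;
   set raw_contacts;   /* hypothetical input data set */
   bad_phone = (prxmatch('/^\d{3}-\d{3}-\d{4}$/', strip(phone)) = 0);
run;
```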


Health Sciences

HS-027 : Tips, Tricks, and Traps on Longitudinal Data Analysis with Discrete and Continuous Times
Brandy Sinco, University of Michigan
Edie Kieffer, University of Michigan
Michael Spencer, University of Washington
Gray Ficker, CHASS Center
Gretchen Piatt, University of Michigan

When longitudinal data are collected at discrete time points, such as at baseline, 6 and 12 months, compared to continuous times, both exploratory data analysis and linear mixed models need to be modified. For data at discrete times, analysts can use Proc Corr to examine the correlation matrix by simply listing the variable names at each timepoint. In contrast, long datasets with continuous times must be transposed to a format that can be used with Proc Corr, by using the first and last functions. This presentation includes tips and tricks for viewing the empirical correlation structure when time is continuous. When using Proc Mixed for a linear mixed model, some correlation structures differ between models with discrete and continuous times. SAS offers correlation structures especially designed for continuous data, as well as structures that were designed for data with discrete times. One trap and trick is the Estimate statement in Proc Mixed. For data at discrete times, time point coefficients are easily included in the Estimate statement. However, for polynomial models that contain time raised to various powers, the proper coding of time can make a difference between getting an "Inestimable error" versus a useful estimate. This presentation features an example with diabetes intervention data collected over several years with a linear mixed model containing a third degree polynomial for time. When time was originally coded in months, the estimate statement in Proc Mixed produced an "Inestimable error". When time was re-coded in years, the estimate statement generated useful information.
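The re-coding trick can be sketched as below; the variable names and model are hypothetical stand-ins for the diabetes example, not the authors' actual code. Dividing months by 12 keeps the cubic time coefficients on a numerically stable scale for the ESTIMATE statement:

```sas
/* Re-code time from months to years before fitting the polynomial */
data long2;
   set long;
   yr = months / 12;
run;

proc mixed data=long2;
   class id group;
   model y = group yr yr*yr yr*yr*yr group*yr / solution;
   random intercept yr / subject=id;
   /* Group difference at yr = 1: coefficients are powers of 1 */
   estimate 'Group diff at 1 year' group 1 -1  group*yr 1 -1;
run;
```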


HS-036 : How to Navigate in a Maze of the Raking Macro with Advanced Weight Trimming
Michael Battaglia, Battaglia Consulting Group, LLC
David Izrael, Abt Associates
Sarah Ball, Abt Associates

Raking to population control totals is often the final step in developing survey weights. Raking is an iterative procedure that brings the weighted sample into agreement on socio-demographic variables that are available for the sample and the population. It is primarily used to reduce unit nonresponse bias. Raking can lead to some observations ending up with extreme weights; in other words, weights that are very large or very small compared to the mean weight, resulting in inflated standard errors. In 2009, we enriched a SAS® raking macro by implementing weight trimming during the raking iterations, ensuring that the weighted sample agreed with the population. We recently enhanced the macro further, adding several options related to weight trimming. Among them are two trimming methods - "AND" or "OR" - and an option that allows different convergence criteria to be set for a subset of the raking variables. This paper should help users to navigate among a number of options and parameters to more efficiently use the power of the raking macro with advanced weight trimming.


HS-038 : Assessing Model Adequacy in Proportional Hazards Regression
Michael G. Wilson, IUSM

Proportional Hazards regression has become an exceedingly popular procedure for conducting analysis on right-censored, time-to-event data. A powerful, numerically stable and easily generalizable model can result from careful development of the candidate model, assessment of model adequacy, and final validation. Model adequacy focuses on overall fitness, validity of the linearity assumption, inclusion of a correct (or exclusion of an incorrect) covariate, and identification of highly influential observations. Due to the presence of censored data and the use of the partial maximum likelihood function, the diagnostics to assess these elements can be slightly more complicated in proportional hazards regression than in most linear modeling exercises. In this paper, graphical and analytical methods using a rich supply of distinctive residuals to address these model adequacy challenges are compared.


HS-039 : Using SAS® to Validate Clinical Prediction Models
Xiaoting Wu, University of Michigan
Chang He, The Michigan Society of Thoracic and Cardiovascular Surgeons Quality Collaborative
Donald Likosky, Michigan Medicine

Model validation is an important step in establishing a clinical prediction model. The validation process quantifies how well the model predicts outcomes for future patients. However, there are very few SAS programming examples showing the validation process. We previously developed a generalized mixed effect model that predicts peri-operative blood transfusion from patient characteristics. In this paper, we demonstrate the SAS® techniques that we used to validate such a model. These validation methods include calibration, discrimination and sensitivity analysis using bootstrapping.


HS-045 : Pan-Cancer Epigenetic Biomarker Selection from Blood Sample Using SAS®
Xi Chen, University of Kentucky
Jin Xie, Department of Statistics, University of Kentucky
Qingcong Yuan, Department of Statistics, Miami University

A key focus in current cancer research is the discovery of cancer biomarkers that allow earlier detection with high accuracy and lower costs for both patients and hospitals. Blood samples have long been used as a health status indicator, but DNA methylation signatures in blood have not been appreciated in cancer research. Historically, analysis of cancer has been conducted directly with the patient's tumor or related tissues. Such analyses allow physicians to diagnose a patient's health and cancer status; however, physicians must observe certain symptoms that prompt them to use biopsies or imaging to verify the diagnosis. This is a post-hoc approach. Our study will focus on epigenetic information for cancer detection, specifically information about DNA methylation in human peripheral blood samples in cancer-discordant monozygotic twin pairs. This information might be able to help us detect cancer much earlier, before the first symptom appears. Several other types of epigenetic data can also be used, but here we demonstrate the potential of blood DNA methylation data as a biomarker for pan-cancer using SAS® 9.3 and SAS® EM. We report that 55 methylation CpG sites measurable in blood samples can be used as biomarkers for early cancer detection and classification. Keywords: SAS, Epigenetics, Cancer Detection, Cancer Biomarker, PCA, Statistical Learning, Machine Learning


HS-053 : Creating suicide attempt/intentional self-harm episodes using administrative billing data
Jamie Kammer, New York State Office of Mental Health
Mahfuza Rahman, NYS OMH
Qingxian Chen, NYS OMH

In identifying the services that individuals received in the time prior to and following a suicide attempt or intentional self-harm episode, it is important to separate out those services that occurred as part of treatment during the emergency room or inpatient visit (including transfers) from services outside of the episode. Specifically, it is useful to understand the service utilization patterns individuals experienced directly prior to a suicide attempt, whether individuals were engaged in outpatient care immediately following emergency room or inpatient treatment, and whether individuals experienced separate suicide attempts outside of an index episode. These insights can inform public health efforts to prevent and treat suicide attempts and intentional self-harm. There are several considerations to take into account when using administrative billing data for public health research and there is a need for standardizing methods around identifying episodes carefully. This paper describes one method for linking together rows of administrative billing data into continuous suicide attempt/intentional self-harm episode(s) with begin and end dates as a first step in service utilization analyses. SAS 9.4 was used.
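A minimal sketch of the row-linking step might use RETAIN and BY-group logic as below; the dataset and variable names are hypothetical, and real episode rules (transfers, gap lengths) would be more involved:

```sas
/* Collapse overlapping or adjacent service rows into episodes */
proc sort data=claims; by person_id svc_start; run;

data episodes;
   set claims;
   by person_id;
   retain epi_num epi_start epi_end;
   /* New episode when the next row does not touch (1-day gap allowed) */
   if first.person_id or svc_start > epi_end + 1 then do;
      epi_num + 1;
      epi_start = svc_start;
      epi_end   = svc_end;
   end;
   else epi_end = max(epi_end, svc_end);
   format epi_start epi_end date9.;
run;
```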


HS-071 : Leveraging SHEWHART Procedure Options to Monitor and Evaluate Improvements in Healthcare
Roderick Jones, Ann & Robert H. Lurie Children's Hospital of Chicago
Lynn (Xiaohong) Liu, Ann & Robert H. Lurie Children's Hospital of Chicago

In healthcare, the purpose of statistical process control (SPC) is often to quantify improvements and identify unintended consequences resulting from an intentional change in an environment, policy, treatment protocol, or decision-support tool. SAS/QC facilitates the production of statistics and their visualization through the SHEWHART procedure. Unlike in manufacturing, process change - rather than stability - is commonly sought, and interventions might be frequent and staggered over time. Using PCHART (for a proportion metric) and XSCHART (for a mean time interval) as examples, we describe approaches to 1) defining a baseline period, and extending its centerline prospectively to apply special cause variation tests against it; 2) removing (or "ghosting") from baseline calculations any subgroups that are considered the result of special cause variation; 3) using TESTNMETHOD=STANDARDIZE when subgroup sample sizes are not constant; 4) using the TESTACROSS option to detect variation spanning phases; 5) leveraging the contents of output datasets.
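A bare-bones p-chart invocation using some of the options named above might look like this; the dataset, variables, and test selection are illustrative only:

```sas
/* p-chart of monthly event proportions with varying subgroup sizes */
proc shewhart data=infections;
   pchart events*month / subgroupn=ncases        /* unequal n per month */
                         tests=1 to 4            /* special-cause tests */
                         testnmethod=standardize /* standardize for unequal n */
                         outlimits=work.limits;  /* save control limits */
run;
```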


HS-073 : A Macro to Import Subject Data Saved in a Location with Separate Subfolders for each Subject
Laurie Smith, Cincinnati Children's Hospital Medical Center

Often, in the health sciences, subject data is saved as one file per subject. When subject data is saved in separate files, it is often difficult to import each file into SAS® without great effort. If the data is organized with the subject ID as the folder name and each subject's data in the corresponding folder, this macro allows a programmer with basic SAS® programming skills to read the subject IDs from the folder names and loop through each subject's folder, importing all data within each folder, using SAS® v9.4 for Windows.
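The core of such a macro can be sketched with the DOPEN/DREAD directory functions; the macro name, &root parameter, and the assumption that each folder holds a data.csv file are all hypothetical:

```sas
/* Loop over subject-ID subfolders and import each subject's file */
%macro import_subjects(root=);
   %let rc  = %sysfunc(filename(fref, &root));
   %let did = %sysfunc(dopen(&fref));
   %do i = 1 %to %sysfunc(dnum(&did));
      %let subj = %sysfunc(dread(&did, &i));   /* folder name = subject ID */
      proc import datafile="&root/&subj/data.csv"
                  out=work.s&subj dbms=csv replace;
      run;
   %end;
   %let rc = %sysfunc(dclose(&did));
%mend import_subjects;
```

Subject IDs beginning with a digit would need a valid-name prefix, as shown with the `s` above.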


HS-074 : Use of SAS Macros to automate the Production of Statistical Process Control Charts
Lynn (Xiaohong) Liu, Ann & Robert H. Lurie Children's Hospital of Chicago
Roderick Jones, Ann & Robert H. Lurie Children's Hospital of Chicago

The SAS/QC SHEWHART procedure generates statistics and statistical process control (SPC) charts used to measure improvement and special cause variation in a process. To produce output automatically on a repeating schedule for a large number of metrics, a system of sequential SAS programs with macro processing was developed and implemented. The foundation for the process is a parameter file, which stores information including each metric's definition, record-level dataset, variable name and label, temporal unit of analysis, starting point of the time period to be analyzed, and SPC chart type. The parameter file is imported and each row (or "run") is converted into macro variables using %LET and %SYSFUNC. CALL SYMPUT, SQL SELECT INTO and %SYMEXIST assign or detect macro variables in real time, which allows a dynamic response from the system. The system of SAS programs acts as a series of pathways, with each run directed according to its macro variables using %IF - %THEN/%ELSE logic. Each pathway has code to read record-level datasets, calculate the necessary summary statistics, and output the results as datasets using ODS OUTPUT for the formatting needed prior to running the SHEWHART procedure. PROC SHEWHART is applied iteratively, the number of times depending on OUTTABLE and OUTHISTORY dataset contents and the detection of shifts that require new phases to be defined. With a %DO loop, the pathways culminate in the generation and delivery to document libraries of SPC chart image files, a metric description image file, and an Excel summary statistics file for each run.
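The parameter-file-to-macro-variable step might be sketched as follows; the file, column, and program names are hypothetical placeholders for the system described:

```sas
/* Load one run's parameters into macro variables, then branch */
proc sql noprint;
   select metric_name, chart_type, start_date
     into :metric trimmed, :ctype trimmed, :start trimmed
     from work.params
    where run_id = &run;
quit;

%macro route;
   %if &ctype = PCHART %then %do;
      %include "pchart_path.sas";     /* hypothetical pathway program */
   %end;
   %else %if &ctype = XSCHART %then %do;
      %include "xschart_path.sas";
   %end;
%mend route;
%route
```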


HS-079 : SAS Text-mining tools applied to Medical Information Assessment: ICD-10 code retrieval
Mohsen Asghari, Computer Engineering and Computer Science Department, University of Louisville
Daniel Sierra-Sosa, Computer Engineering and Computer Science Department, University of Louisville
Adel S. Elmaghraby, Computer Engineering and Computer Science Department, University of Louisville

The patient diagnosis data recorded in Electronic Health Records (EHR) are usually in free-text format. Applying machine-learning techniques to this data is challenging; one example is using natural language processing (NLP) to extract diagnostic codes. An important code to extract is the International Classification of Diseases (ICD) code, which has multiple versions. This code provides valuable information for medical information assessment. Currently, in order to understand a patient's status, there is a need to read the medical reports and notes and then consult the tabular or alphabetic sections of the ICD, an inconvenient process for medical personnel and users. In the US, as of 2014, the transfer from ICD-9 to ICD-10 had been delayed by 10 years. Coders must be able to use ICD-9 quickly and efficiently, and although the ICD-10 upgrade is looming in the US, they should also be able to use the new set of codes effectively. There is a need to address the challenges that this transition brings to coders. We propose the usage of SAS Text-mining tools to extract ICD codes from a medical record. We focused on creating a systematic coding tool for ICD-10 recognition and extraction based on free-text inputs from medical data. Our results using the MIMIC version III database are promising and are reported in detail.


HS-080 : A Data Mining Approach to Predict Dental Adverse Events
Sunil Kumar, Lead Data Scientist

Identifying patients at elevated dental risk is one of the most important topics in the dental care industry. In this paper, a data mining approach is discussed to predict patients at elevated dental risk. SAS® Enterprise Guide and SAS® Enterprise Miner were used for data manipulation and survival modeling.


HS-083 : A Macro to Calculate Sample Size for Studies Using the Proportional Time Assumption
Brian Mosier, University of Kansas Medical Center
John Keighley, University of Kansas Medical Center
Milind Phadnis, University of Kansas Medical Center

Sample size calculations for time-to-event outcomes are done mostly based on the assumption of proportional hazards or of exponentially distributed survival times. These assumptions are not appropriate for all scenarios and should not be implemented if the assumptions are not met. Phadnis et al.1 introduce an alternative method using the assumption of proportional time, employing the generalized gamma ratio distribution to calculate sample size. We developed a macro to calculate the sample size needed for studies using the proportional time assumption for a given value of power in an efficient way. The macro automates the method from the paper to simulate survival data for two treatment arms with the test statistic following a generalized gamma ratio distribution. It then utilizes the bisection method in order to find the sample size needed for the power input by the user along with additional parameters. We have implemented various features in the macro, allowing for one- or two-sided tests and an option that graphs the power function. This macro is a tool statisticians can use to make sample size calculations for studies using the proportional time assumption when some form of historical information is available from a prior study. 1Phadnis MA, Wetmore JB, Mayo MS. A clinical trial design using the concept of proportional time using the generalized gamma ratio distribution. Statistics in Medicine. 2017;36:4121-4140 https://doi.org/10.1002/sim.7421


HS-088 : Baseline Mean Centering for Analysis of Covariance (ANCOVA) Method of Randomized Controlled Trial Data Analysis
Jennifer Scodes, New York State Psychiatric Institute

Many analytical approaches exist to compute treatment effects and within-group changes from baseline for data analysis of randomized controlled trials with multiple follow-up visits. One of these approaches is the analysis of covariance (ANCOVA) method, in which baseline values are included as a covariate instead of as an outcome. Using the ANCOVA method, the treatment effects can be easily computed from model estimates; however, within-group changes from baseline cannot be directly computed in SAS procedures without centering the outcome measures by the overall baseline mean. This paper will present a macro that can be used to analyze data from two-arm randomized controlled trials using the ANCOVA method to compute and present both treatment effects and within-group changes from baseline using baseline mean centering. This paper is intended for all levels of SAS users that analyze clinical trial data.


HS-094 : Genocide Modeling - Historical Risk Factors and Odds Ratios
David Corliss, Peace-Work

This analysis identifies risk factors associated with genocide events. A review of historical conflicts where genocide was present in some and not others provided the data. Using these data, Decision Tree and Random Forest models identify variables with measurable association with genocide events. Logistic Regression and Decision Tree methods are applied to the screened list of variables. Odds ratios are calculated to assess the relative risk of different factors. These models are used to assess the relative likelihood of genocide occurring or developing in the coming year in various countries.


HS-118 : Bootstrap Linear Mixed-Effects Models using SAS Procedures
Adams Kusi Appiah, University of Nebraska Medical Center

The bootstrap resampling technique is a general method for estimating the sampling distribution of a statistic of interest and is applied in many research applications. It can obtain more robust parameter estimates and confidence intervals in situations where no assumptions about the underlying distribution of the model are available. However, the main concern of the bootstrap method is how to generate a bootstrap distribution to resemble the true distribution of the observed data. In the context of linear mixed effects models, the distribution of samples should be generated to account for between-subject variability and residual variability in the data. The bootstrap method will be applied with various ways to resample data for linear mixed effects models using the SURVEYSELECT and MIXED procedures available with SAS®/STAT software. These methods will be applied to assess the uncertainty of parameter estimates in linear mixed effect models with data from the National Cooperative Gallstone Study (NCGS). The results by maximum likelihood (ML), restricted maximum likelihood (REML), and the bootstrap methods are compared. The parametric, semiparametric, and non-parametric bootstrap methods for generating samples and estimating the parameters are discussed.
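A nonparametric, subject-level (cluster) bootstrap along these lines might be sketched as below; the dataset and model are hypothetical stand-ins for the NCGS analysis:

```sas
/* Resample whole subjects with replacement, 500 replicates */
proc surveyselect data=ncgs out=boot
     method=urs samprate=1 outhits reps=500 seed=2018;
   cluster id;
run;

/* Refit the mixed model within each replicate.
   Caveat: with OUTHITS, repeated draws of a subject share the same
   id and should be given distinct ids before fitting. */
proc mixed data=boot;
   by replicate;
   class id;
   model y = time trt / solution;
   random intercept / subject=id;
   ods output SolutionF=boot_est;   /* fixed effects per replicate */
run;
```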


HS-119 : Automated Transfer of a Sea of SAS® Programs between Data Transfers
Xuelin Li, Eli Lilly and Company
Jameson Cai, Eli Lilly and Company
Cindy Lee, Eli Lilly and Company

In the pharmaceutical industry, a huge number of SAS programs are rerun routinely to refresh SDTM/ADaM data sets and TFLs (Tables, Figures, Listings) in different folder locations to accommodate data transfers. Manually updating the paths within the programs can be tedious. In this paper, we develop a method to move the SAS programs to the new location with automated path updates within the programs. In our approach, we first use the PIPE engine in the FILENAME statement to collect the file names of all the SAS programs in a location and create a macro variable for the list of file names. Secondly, we use the INFILE statement to create a temporary dataset by reading each SAS program. Thirdly, from each temporary dataset we use the FILE statement to write a new SAS program in the new location with an updated path name. This method facilitates the process of moving SAS programs from one location to another seamlessly and saves hours of programmers' time in updating every program. More importantly, this automation eliminates the chance of human error. The work involved in this abstract can be done using SAS version 9. The audience for this presentation is expected to have advanced SAS skills.
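The three steps can be sketched as follows; the old/new locations are hypothetical Windows paths, and the real method additionally builds a macro variable list of file names:

```sas
/* Step 1: PIPE engine lists the programs in the old location */
filename plist pipe 'dir /b "C:\old\*.sas"';

data pgms;
   infile plist truncover;
   input pgm $256.;
run;

/* Steps 2-3: read each program and write it to the new location
   with the path text updated */
data _null_;
   set pgms;
   length inname outname $300 line $32767;
   inname  = cats('C:\old\', pgm);
   outname = cats('C:\new\', pgm);
   infile in  filevar=inname end=eof truncover lrecl=32767;
   file   out filevar=outname lrecl=32767;
   do while (not eof);
      input line $char32767.;
      line = tranwrd(line, 'C:\old\', 'C:\new\');  /* update paths */
      put line;
   end;
run;
```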


HS-123 : A Macro to Add SDTM Supplemental Domain to Standard Domain
Pratap Kunwar, EMMES Corp

Many pharmaceutical and biotechnology companies now prefer to set up Study Data Tabulation Model (SDTM) mapping at the beginning of a study rather than at the end, and to use SDTM datasets to streamline the flow of data from collection through submission. With SDTM datasets at their disposal, it is a logical choice to use them for any clinical reports. Getting information from a supplemental (SUPP) domain back into its parent domain is a regular step that programmers cannot avoid. However, this step can be very tricky when either (1) the SUPP domain contains multiple types of identifying variables, or (2) the SUPP domain is empty or does not exist. In this presentation, I will present an easily understandable macro that produces correct results in every possible scenario.
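The underlying step the macro automates can be sketched for the AE/SUPPAE pair, assuming here that IDVAR is AESEQ; handling other identifying variables and empty or missing SUPP datasets is exactly what the full macro adds:

```sas
/* Transpose SUPPAE to one row per subject/record, then merge back */
proc sort data=suppae; by usubjid idvarval; run;

proc transpose data=suppae out=supp_t(drop=_name_ _label_);
   by usubjid idvarval;
   id qnam;
   var qval;
run;

proc sql;
   create table ae_full as
   select a.*, s.*
   from ae as a
   left join supp_t as s
     on a.usubjid = s.usubjid
    and a.aeseq   = input(s.idvarval, best32.);  /* IDVARVAL is character */
quit;
```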


HS-132 : Toward Adoption of Agile Software Development in Clinical Trials
Troy Hughes, Datmesis Analytics

Agile methodologies for software development, including the Scrum framework, have grown in use and popularity since the 2001 Manifesto for Agile Software Development. More than having obtained ubiquity, Agile demonstrably has defined software development in the 21st century with its core foci in collaboration, value-added software, and flexibility gained through incremental development and delivery. Although Agile principles can easily be extrapolated to other disciplines, Agile-related nomenclature, literature, application, and employment descriptions often remain focused on software development alone. In SAS® data analytic development-which often typifies software development that occurs within the clinical trials/pharmaceutical industry-developers and other practitioners also build complex, enduring software and data infrastructure, but often for their own or their team's use, and usually with the intent of transforming data into knowledge and data-driven decisions. And, because these outcomes are more highly valued than their underlying code, clinical trials organizations are more likely to focus on data quality than code quality. Thus, because software development methodologies such as Agile focus on programming and the programming environment and process, they can be overlooked by clinical trials organizations who view software development as a tool, not a product or outcome. Notwithstanding, every tool deserves to be wielded effectively and, to that end, this text introduces Agile development for use in clinical trials organizations. Moreover, the paucity of reference to Agile or any software development life cycle (SDLC) or development methodology within clinical trials is demonstrated through examination of SAS® user-published white papers, SAS® Institute books, and clinical trials employment postings.


HS-136 : Addressing Opioid Crisis using Data Science
Rishabh Mishra, Oklahoma State University

Opioids are medications commonly prescribed by doctors, mainly for treating acute and chronic pain. They are highly addictive, and patients tend to become tolerant to the drug after a certain point in time. This means that patients either have to increase the dosage of the drug or stop taking it, and both choices have their own set of disadvantages. On one hand, an overdose of these drugs can be fatal; on the other hand, stopping these drugs can cause severe withdrawal symptoms and recurrence of pain. In this paper, we use opioid data to predict the death ratio in each state across the United States. We use prescription data, patient survey data and death data for this analysis. If we can accurately predict the states with high death rates, then it is possible that governmental action can be taken to avoid deaths in those states. From the www.cdc.gov website, we obtained an overdose dataset, which contains prescriptions made by various physicians, and death rates by state for all deaths caused by opioid overdose. This project uses SAS Enterprise Guide and SAS Enterprise Miner to conduct predictive analysis using methods like decision trees, logistic regression, and random forests.


HS-142 : Spotfire Clinical visualizations from SAS and R
Zeqing Lu, Eli Lilly
Hillary Graham, Eli Lilly
Jessica Chen, Eli Lilly

Data visualization is an innovative way to make our business decision-making process more intuitive and efficient. With this emphasis in mind, our project focuses on the interactive capabilities of the popular visualization tools Spotfire, SAS and R. Spotfire is a very powerful data analytics tool. Not only does Spotfire enable us to create various analyses using its robust visualization layout, it is also surprisingly user-friendly. Its intuitive user interface, potent visualization prowess, and quick subgroup filters make it an indispensable tool for the optimization of drug development. In the past, we could only present screenshots of Kaplan-Meier and forest plots in data review meetings. Statistical tests and visualizations that Spotfire does not currently support can easily be done in SAS and R. SAS datasets can be directly imported into Spotfire and linked with safety/efficacy datasets. Then we can create visualizations based on the SAS-generated datasets. In addition, Spotfire is equipped with TERR, the Tibco Enterprise Runtime for R, which allows Spotfire to access and run the R application within its own interface. R enables us to compute the relevant statistical tests to display alongside the visualizations. Using these features, we were able to create on-demand interactive visualizations to present subgroup efficacy analyses. Exploratory analyses can be viewed before they are formalized, so fewer ad hoc TFLs are needed, which saves cost. In addition, the efficacy templates can improve team engagement during data review meetings, which leads to faster business decisions.


HS-143 : AutoPDF : an R Package to Output Vector Graphics
Hillary Graham, Eli Lilly
Zeqing Lu, Eli Lilly
Michelle Carlsen, Eli Lilly

Many statistical analysts have been outputting graphics in RTF format. In pharmaceutical companies, RTF files are difficult to alter for publication or submission because they are not an editable file type. On the other hand, vector-based files enable users to modify colors, sizes, labels, etc. in Adobe Illustrator before disclosure. Currently, R has the capability to create vector graphics in PDF format. However, the process of outputting graphics in PDF format from R requires several steps. This can be irritating for experienced users, and discouraging for new users. In order to make the process easier, we have created an R package to output graphics into a PDF. This will automate the creation of vector-based graphics in R, making it more efficient for both new and experienced R-users to prepare plots ready for disclosure.


SAS 101 Plus

SP-002 : Making Your SAS® Output, Results, Reports, Charts and Spreadsheets More Meaningful with Color
Kirk Paul Lafler, Software Intelligence Corporation

Color can help make SAS® output, results, reports, charts and spreadsheets more professional and meaningful. Instead of producing boring and ineffective results, users can enhance the appearance of their output to highlight and draw attention to important data elements and issues, including headings, subheadings, footers, minimum and maximum values, ranges, outliers, special conditions, and other elements. Color can be added to text, foreground, background, row, column, cell, summary, and total with amazing color and traffic-lighting scenarios. Topics include an assortment of examples illustrating the various ways output, documents, reports, charts and spreadsheets can be enhanced with color, and how to effectively add color to PDF, RTF, HTML, and Excel spreadsheet results using PROC PRINT, PROC REPORT, PROC TABULATE, and PROC SGPLOT and the Output Delivery System (ODS) with style.
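One traffic-lighting pattern of the kind the paper covers can be sketched with a user format driving cell background colors; the thresholds and colors are illustrative:

```sas
/* Format maps value ranges to background colors */
proc format;
   value hilite low-<1000  = 'salmon'
                1000-<5000 = 'lightyellow'
                5000-high  = 'lightgreen';
run;

ods html file='sales.html';
proc report data=sashelp.shoes(obs=20);
   column region product sales;
   define sales / analysis sum
          style(column)={background=hilite.};  /* color by value */
run;
ods html close;
```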


SP-048 : Using Multilabel Formats with PROC SUMMARY to Generate Report Data with Overlapping Time Segments
John Schmitz, Luminare Data

SAS introduced the multi-label format (MLF) in Version 8. Yet, few users are familiar with the MLF or its unique capabilities. MLFs are used for data summarization where the same observation may be classified into 2 or more levels, simultaneously. This paper shows how multi-label formats can be used to generate time segments with overlapping periods. Core steps include creating the multi-label format definition, applying MLFs to CLASS variables within PROC SUMMARY, and properly understanding the results.
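A minimal sketch of the idea: one date falls into several overlapping periods at once, so each period total includes it (dataset and period definitions are invented examples):

```sas
/* Overlapping time segments defined in a multilabel format */
proc format;
   value mlperiod (multilabel)
      '01JAN2018'd - '31MAR2018'd = 'Q1 2018'
      '01JAN2018'd - '30JUN2018'd = 'H1 2018'
      '01JAN2018'd - '31DEC2018'd = 'Year 2018';
run;

proc summary data=visits nway;
   class visit_date / mlf;          /* honor the multilabel format */
   format visit_date mlperiod.;
   var charge;
   output out=byperiod sum=;
run;
```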


SP-049 : Data-driven Data Analysis
Jack Shoemaker, Texture Health

When confronted with a new data channel, the modern data scientist or analyst will employ sophisticated data visualization tools like Visual Analytics to size up the data. Not all users have access to these tools and must rely on more pedestrian code-based approaches. This paper explores techniques using Base SAS to provide data-driven data analysis to help size up data absent the more modern tools. These techniques leverage the details about data available from the dictionary subsystem. Knowing the names, formats, and data types of the data allows one to derive great insight into the content of the data stream.


SP-057 : A Macro that Can Get the Geo Coding Information from the Google Map API
Ting Sa, Cincinnati Children's Hospital Medical Center

This paper introduces a macro that can automatically get geocoding information from the Google Maps API for the user. The macro can get the longitude, latitude, standardized address, and address components like street number, street name, county or city name, state name, and zip code. To use the macro, the user only needs to provide a simple SAS input data set; the macro will then automatically retrieve the data and save it into a SAS data set. The paper includes all the SAS code for the macro and provides an input data example to show you how to use the macro.
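A single call of the kind such a macro wraps might be sketched with PROC HTTP against the Google Maps Geocoding API; YOUR_KEY is a placeholder, and the JSON libname engine shown requires SAS 9.4M4 or later:

```sas
/* One geocoding request; the macro would loop this over input rows */
filename resp temp;
proc http
   url='https://maps.googleapis.com/maps/api/geocode/json?address=Cincinnati,OH&key=YOUR_KEY'
   method="GET" out=resp;
run;

libname g json fileref=resp;   /* parse the JSON reply */
/* Inspect the datasets the engine creates to locate lat/lng and
   the address components */
proc datasets lib=g nolist;
   contents data=_all_;
quit;
```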


SP-061 : Seeing the Forest for the Trees: Part Deux of Defensive Coding by Example
Donna Levy, Syneos Health
Nancy Brucken, Syneos Health

As statisticians and programmers, SAS® is part of our daily life. Through assessing patterns, data quality, programming datasets, analysis displays or developing simulations, we need to determine the best ways to conduct our daily work, allowing us to see the forest for the trees. This paper provides guidance on quality defensive programming, efficient coding as well as good programming concepts. Programming no-no's will also be discussed. The concepts discussed will allow us to navigate through the trees --- that is, seeing the trees for the forest. We may have been programming in SAS for weeks, months, years or decades. Regardless, we should continue to expand our skills and continue learning and updating our techniques. With this paper, we will provide reminders for paths lost in the past, as well as new tips to help us clear the brush from the trail. This paper is part deux of Defensive Coding by Example (Brucken and Levy, 2015), quenching our thirst for adventure in the great SAS hinterland.


SP-062 : Efficiency Programming with Macro Variable Arrays
Veronica Renauldo, QST Consultations

Macros in themselves boost productivity and cut down on user errors. However, most macros are not robust and serve only a few specific repetitive purposes. Just like arrays increase the efficiency of a DATA step, macro variable arrays increase the efficiency of a macro. Macro variable arrays allow the macro to function more autonomously than what is typical for macro processing and work in all SAS® platforms that support macro processing. Automating the process of determining the number of times a macro needs to be utilized for a task is just one of the several applications of macro variable arrays. There are numerous ways to create macro variable arrays, such as %LET statements, PROC SQL, and CALL SYMPUT statements, each with their own user-friendly approach. Macro variable arrays employ the use of loops and logic to construct comprehensive macros allowing for a multitude of output types functioning within one macro call. Constructing dynamic macros will increase the capacity of a macro while dramatically decreasing the lines of code in each program. In conjunction with macro functions such as %SYSFUNC, %SCAN, and %STR, macro variable arrays allow the creator and user of a macro to be more flexible with their coding, ultimately leading to more productivity with fewer code alterations. Impress your boss, your friends, and yourself with macro code that almost writes itself.
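One of the creation methods named above, PROC SQL SELECT INTO, can be sketched like this; the "print every WORK dataset" task is an invented example:

```sas
/* Build a macro variable array of dataset names, count with &sqlobs */
proc sql noprint;
   select memname into :ds1- :ds999
   from dictionary.tables
   where libname = 'WORK';
   %let nds = &sqlobs;
quit;

/* Loop over the array with indirect (&&) resolution */
%macro print_all;
   %do i = 1 %to &nds;
      title "Listing of work.&&ds&i";
      proc print data=work.&&ds&i (obs=5); run;
   %end;
%mend print_all;
%print_all
```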


SP-063 : Dating for SAS Programmers
Josh Horstman, Nested Loop Consulting

Every SAS programmer needs to know how to get a date... no, not that kind of date. This paper will cover the fundamentals of working with SAS date values, time values, and date/time values. Topics will include constructing date and time values from their individual pieces, extracting their constituent elements, and converting between various types of dates. We'll also explore the extensive library of built-in SAS functions, formats, and informats for working with dates and times using in-depth examples. Finally, you'll learn how to answer that age-old question... when is Easter next year?
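A brief sketch of the fundamentals the abstract covers (constructing, decomposing, and converting date and datetime values), including the age-old Easter question:

```sas
/* Construct, decompose, and convert SAS date and datetime values.    */
data date_demo;
   d  = mdy(10, 1, 2018);             /* build a date from its parts  */
   yr = year(d);                      /* extract a constituent element*/
   dt = dhms(d, 14, 30, 0);           /* date -> datetime value       */
   back = datepart(dt);               /* datetime -> date             */
   next_easter = holiday('EASTER', year(today()) + 1);
   format d back next_easter date9. dt datetime20.;
run;
```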


SP-066 : Merge with Caution: How to Avoid Common Problems when Combining SAS Datasets
Josh Horstman, Nested Loop Consulting

Although merging is one of the most frequently performed operations when manipulating SAS data sets, there are many problems which can occur, some of which can be rather subtle. This paper examines several common issues, provides examples to illustrate what can go wrong and why, and discusses best practices to avoid unintended consequences when merging.
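As a hedged illustration of the defensive pattern such papers recommend (data set and variable names are hypothetical): sort both inputs by the key, then use IN= flags to control exactly which observations survive the merge.

```sas
/* Sort both inputs by the BY key before merging. */
proc sort data=demog;  by subjid; run;
proc sort data=visits; by subjid; run;

data combined;
   merge demog (in=in_d) visits (in=in_v);
   by subjid;
   if in_d and in_v;                  /* keep matched subjects only   */
run;
```

Omitting the BY statement, or merging data sets that share non-BY variable names, are two of the subtle problems that can silently corrupt results.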


SP-069 : What's the Difference? Using the PROC COMPARE to find out.
Larry Riggen, Indiana University

We are often asked to determine what has changed in a database. There are many tools that can provide a list of before and after differences (e.g. Redgate Data Compare), but SAS PROC COMPARE can be coupled with other tools in base SAS to analyze the changes. This paper will explore using the output file produced by PROC COMPARE and the SAS Macro language to produce spreadsheets of detailed differences and summaries to perform this task.
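A minimal sketch of the starting point the abstract describes, capturing only the differing observations in an output data set for downstream processing (data set and ID names are illustrative):

```sas
/* Write unequal observations from both data sets to DIFFS for
   further analysis or export to a spreadsheet.                       */
proc compare base=before compare=after
             out=diffs outnoequal outbase outcomp noprint;
   id subjid;
run;
```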


SP-075 : From Clicking to Coding: Using ODS Graphics Designer as a Tool to Learn Graph Template Language
Margaret Kline, Grand Valley State University
Daniel Muzyka, Grand Valley State University

ODS Graphics Designer brings simple graphics creation to SAS platforms 9.2 and later. This application enables any novice user who can navigate an interactive point-and-click menu to generate highly customizable graphical representations. ODS Graphics Designer, which functions in conjunction with the suite of SAS products, can be invoked to facilitate the creation of Graph Template Language (GTL) through a non-intimidating interface. Not only can the generated code be edited later, but the instant gratification of a striking graphic display can encourage novice users to continue expanding their SAS skills or ease their transition from other software. The untapped potential of ODS Graphics Designer as an educational tool lies in its ability to acquaint users with the underlying syntax of GTL. This paper describes how to access and navigate the user interface, provides examples of generated and edited code, and discusses potential uses and limitations to showcase ODS Graphics Designer as a pedagogical tool for beginner to intermediate programmers.


SP-076 : Tips, Traps, and Techniques in BASE SAS for vertically combining SAS data sets
Jayanth Iyengar, Data Systems Consultants LLC

Although not as frequent as merging, a data manipulation task which SAS programmers are required to perform is vertically combining SAS data sets. The SAS system provides multiple techniques for appending SAS data sets, also known as concatenating or stacking. There are pitfalls and adverse data quality consequences to using traditional approaches to appending data sets, as well as efficiency implications with the different methods. In this paper, with practical examples, I examine the technical procedures that are necessary to prepare data to be appended. I also compare the different methods available in BASE SAS to append SAS data sets, based on efficiency criteria.
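Two of the traditional appending techniques such comparisons typically cover, shown as a hedged sketch with illustrative data set names:

```sas
/* SET rewrites the entire result; PROC APPEND reads only the new
   rows and adds them to the base data set in place.                  */
data all_months;
   set jan feb mar;                   /* concatenate via DATA step    */
run;

proc append base=all_months data=apr force;
run;
```

The FORCE option lets PROC APPEND proceed when variable attributes differ, which is exactly the kind of silent data quality consequence the paper examines.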


SP-078 : Improving Plots Using XAXISTABLE and YAXISTABLE
Jacob Keeley, Grand Valley State University
Carl Nord, Eli Lilly and Company

New to the SGPLOT procedure for SAS 9.4, the XAXISTABLE and YAXISTABLE statements respectively create an X/Y axis-aligned row of textual data placed at specific locations in relation to the primary plot within the given SGPLOT procedure. The XAXISTABLE and YAXISTABLE statements are applicable with any primary plot, aside from BAND, BLOCK, FRINGE, REG, LOESS, and PBSPLINE plots. Along with directing the X, Y coordinates of the supplementary data values, there are many options accompanying the XAXISTABLE/YAXISTABLE statements which allow the user to change the color, order, and position of the accompanying row(s) of data. The XAXISTABLE statement proves its worth when used in conjunction with survival analysis. When dealing with Kaplan-Meier survival curves, the XAXISTABLE statement essentially allows the user to personalize their own at-risk tables when the LIFETEST procedure lacks the functionality necessary for the request. In this framework, the XAXISTABLE statement improves greatly upon what would have previously been an arduous task. Overall, the XAXISTABLE and YAXISTABLE statements are a welcome addition to the SGPLOT syntax, as the user is given even more control over the appearance of a desired plot, making it less likely that the plot needs to be altered after it has been output.
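A minimal sketch of the statement in its simplest form, using a SASHELP data set purely for illustration:

```sas
/* Display a row of WEIGHT values aligned under the X axis. */
proc sgplot data=sashelp.class;
   series x=age y=height;
   xaxistable weight / position=bottom;  /* axis-aligned text row     */
run;
```

The same statement, fed with at-risk counts, is what enables the customized Kaplan-Meier at-risk tables the abstract describes.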


SP-084 : Automating SAS Job Streams With the Power of VB Script
Lindsey Whiting, Kohler
Joey Kaiser, Kohler

In order to make important business decisions it is crucial to have the ability to manipulate and analyze large amounts of data. With the amount of data available to us in the world today, automating SAS jobs has become a necessity to provide efficient and critical business improvements. A great strategy to automate SAS jobs and give you the ability for custom error checking is to leverage VB Scripting with your SAS code. This paper will discuss job stream automation and utilizing SAS to check files' statuses, notify users of program errors and automatically send out the results of your code.


SP-100 : Using SASv9.cfg, autoexec.sas, SAS® Registry, and Options to Set Up Base SAS®
Peter Eberhardt, Fernwood Consulting Group Inc

Are you frustrated with manually setting options to control your SAS® Display Manager sessions but become daunted every time you look at all the places you can set options and window layouts? In this paper, we look at various files SAS accesses when starting, what can (and cannot) go into them, and what takes precedence after all are executed. We also look at the SAS Registry and how to programmatically change settings. By the end of the paper, you will be comfortable in knowing where to make the changes that best fit your needs.


SP-106 : SAS Techniques to Handle Big Files And Reduce Execution times
Kaiqing Fan, PNC Bank

As senior data scientists and SAS technical leads, we constantly struggle with big data execution, and especially with the long execution times of our SAS engines: sometimes a couple of hours, sometimes more than 40 hours or even longer, even when we use many servers and a great deal of memory. Such long execution times are not acceptable. With the right techniques, execution time can be shortened enormously: I have reduced the execution times of SAS engines from 136 hours to around 2 hours, from 9 hours to 20-30 minutes, and from 3.5 hours to 6 minutes. This paper summarizes most of the technical skills I used and shares them with you.


SP-116 : Order, Order! Four Ways to Reorder Your Variables, Ranked by Elegance and Efficiency
Louise Hadden, Abt Associates Inc.

SAS® practitioners are frequently required to present variables in an output data set in a particular order, or standards may require variables in a production data set to be in a particular order. This paper and presentation offer several methods for reordering variables in a data set, encompassing both data step and procedural methods. Relative efficiency and elegance of the solutions will be discussed.
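One of the data step methods commonly compared in such papers, shown as a hedged sketch with hypothetical variable names:

```sas
/* A RETAIN statement placed before SET fixes the variable order in
   the output data set without changing any values.                   */
data reordered;
   retain id name visit_date result;  /* desired variable order       */
   set original;
run;
```

This works because SAS assigns variable positions in the order it first encounters the names, and RETAIN names them before SET does.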


SP-139 : Keeping Up to Date with ODS Graphics
Warren Kuhfeld, SAS

SAS 9.4M5 provides you with several enhancements to ODS Graphics including a new procedure. You can use PROC SGMAP to create maps and superimpose graphs such as bubble plots. Bar charts in existing procedures such as PROC SGPLOT have new options for fill patterns and fill types. New options for box plots enable you to display statistics and control the caps on whiskers. Other options enable you to modify tick labels, tick styles, legends, baselines, and reference line thickness. You can also control the image names when there are BY groups. This talk describes and illustrates these recent updates.


SAS 301 Beyond the Basics

SB-010 : A Quick Look at Fuzzy Matching Programming Techniques Using SAS® Software
Kirk Paul Lafler, Software Intelligence Corporation
Stephen Sloan, Accenture

Data comes in all forms, shapes, sizes and complexities. Stored in files and data sets, SAS® users across industries know all too well that data can be, and often is, problematic and plagued with a variety of issues. When unique and reliable identifiers, referred to as the key, are available, users routinely are able to match records from two or more data sets using merge, join, and/or hash programming techniques without problem. But, when a unique and reliable identifier is not available, or does not exist, then one or more fuzzy matching programming techniques must be used. Topics include introducing what fuzzy matching is along with examples of the SOUNDEX (for phonetic matching) algorithm, and the SPEDIS, COMPLEV, and COMPGED functions to resolve key identifier issues and to successfully merge, join and match less than perfect or messy data.
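The four matching tools the abstract names, applied to a single hypothetical pair of names as a sketch:

```sas
/* Compare two not-quite-identical strings with each technique. */
data fuzzy_demo;
   a = 'Johnathan Smith';
   b = 'Jonathan Smyth';
   phonetic_match = (soundex(a) = soundex(b)); /* phonetic test       */
   spelling_dist  = spedis(a, b);     /* asymmetric spelling distance */
   lev_dist       = complev(a, b);    /* Levenshtein edit distance    */
   ged_cost       = compged(a, b);    /* generalized edit distance    */
run;
```

In practice these scores are computed across candidate record pairs, and a threshold on the distance decides which pairs count as a match.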


SB-019 : Visual Storytelling - The Art of Communicating Information with Graphics
Kirk Paul Lafler, Software Intelligence Corporation

Telling a story with facts alone can be boring, while stories told visually engage. It's been said that humans process visual elements many times faster than written words. The data analysis process involves the gathering, cleansing, transforming, modeling, and storytelling of data from various sources. The objective is to discover, evaluate, understand, and derive useful information from the data to support decision-making. Unfortunately, data analysts sometimes omit a crucial step: developing a visual narrative about the data analysis process and its outcome. This omission not only fails to bring context, insight, and interpretation to the data analysis results in a clear and precise way, it neglects to bring meaning, relevance, and interest to the key points of those results. Topics include the importance, considerations, and steps needed to develop a compelling narrative with visuals; communicating a convincing point of view by letting your visuals do the talking; helping your audience see hidden, or hard-to-see, things in your data; avoiding the obvious by surprising and engaging your audience; techniques for sharing a lasting message by teaching something; and a variety of visuals and graphics that persuade your audience to understand the complexities of the data analysis results.


SB-021 : How to Assembly Line Create Graphic Images Using PROC TEMPLATE in SAS Enterprise Guide? Part I
Kaiqing Fan, Mastech Digital Inc (PNC Bank)

In the banking industry, variables, their values, and requirements are constantly changing. As SAS developers, we may be asked to create hundreds or thousands of composite or single graphic images (scatter plots, series plots, step plots, vector plots, bar charts, line charts, pie charts, waterfall charts, box plots, density plots, histograms, loess plots, penalized B-spline plots, and regression plots) using a pipeline approach through the SAS graphic engines. At that scale, it is impossible to manually modify the many parameters or engine code for each graphic image, and any manual intervention can cause serious errors. The question, then, is how to create composite or single graphic images assembly-line style using PROC TEMPLATE. To reach this target, we need to automatically generate all or most parameters and anticipate expected changes where possible. My paper "Some Tricks and Explanations When Plotting Graphic Images Using PROC TEMPLATE in SAS® Enterprise Guide, Part III" was accepted at SAS Global Forum 2018 (session ID 2545); Parts I and II are presented here. Together, the three parts form a complete treatment of assembly-line creation of hundreds or thousands of graphics in SAS.


SB-034 : Backsplash patterns for your world: A look at SAS OpenStreetMap (OSM) tile servers
Barbara Okerson

Originally limited to SAS Visual Analytics, SAS now provides the ability to create background maps with street and other detail information in SAS/GRAPH® using open source map data from OpenStreetMap (OSM). OSM provides this information using background tile sets available from various tile servers, many available at no cost. This paper provides a step-by-step guide for using the SAS OSM Annotate Generator (the SAS tool that allows use of OSM data in SAS). Examples include the default OpenStreetMap tile server for streets and landmarks, as well as how to use other free tile sets that provide backgrounds ranging from terrain mapping to bicycle path mapping. Dare County, North Carolina is used as the base geographic area for this presentation.


SB-040 : Conversion of CDISC specifications to CDISC data - specifications driven SAS programming for CDISC data mapping
Yurong Dai, Eli Lilly
Jiangang Jameson Cai, Eli Lilly

This paper presents a metadata-driven approach that utilizes SAS programming techniques for SDTM and ADaM data mapping. Metadata extracted from the specifications is converted into dataset attributes, formats, variable names, variable order, and sort order for specification implementation in our reference code. This approach increases the code's reusability, efficiency, and consistency between the data specifications and the output data, and reduces rework after specification updates during code development for SDTM mapping and ADaM dataset derivation.


SB-052 : Show Me That? Using SAS VIYA, Visual Analytics and Free ESRI Maps to Show Geographic Data
John Schmitz, Luminare Data

Visual Analytics includes features to connect to free, premium and custom ESRI map capabilities to display geographic information. This paper provides a simple example for generating a shaded state map, based on input data and free ESRI map capabilities. The paper reviews key configuration settings that impact ESRI map capabilities, generation and promotion of data for use by the geo-map feature, defining the category field for use with geo-mapping, filtering graph data, and producing a state-level shaded map.


SB-059 : Finding National Best Bid and Best Offer - Quote by Quote
Mark Keintz, Wharton Research Data Services

U.S. stock exchanges (currently there are 12) are tracked in real time via the Consolidated Trade System (CTS) and the Consolidated Quote System (CQS). CQS contains every updated quote from each of these exchanges, covering some 8,500 stock tickers. It provides the basis by which brokers can honor their fiduciary obligation to investors to execute transactions at the best price, i.e. at the NBBO (National Best Bid or Best Offer). With the advent of electronic exchanges and high frequency trading (timestamps are published to the microsecond), data set size (approaching 1 billion quotes requiring 80 gigabytes of storage for a normal trading day) has become a major operational consideration for market behavior researchers recreating NBBO values. This presentation demonstrates a straightforward use of hash tables for tracking constantly changing quotes for each ticker/exchange combination to provide the NBBO for each ticker at each time point in the trading day.


SB-060 : From Stocks to Flows: Using SAS® HASH objects for FIFO, LIFO, and other FO's
Mark Keintz, Wharton Research Data Services

Tracking gains or losses from the purchase and sale of diverse equity holdings depends in part on whether stocks sold are assumed to be from the earliest lots acquired (a FIFO queue) or the latest lots acquired (LIFO). Other inventory tracking applications have a similar need for applying either FIFO or LIFO rules. This presentation shows how a collection of simple ordered hash objects, in combination with a hash-of-hashes, is a made-to-order technique for easy data-step implementation of FIFO, LIFO, and other less likely rules, like HIFO (highest price first out) and LOFO (lowest price first out).
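A hedged sketch of the core idea for a single ticker (variable names are hypothetical): an ordered hash object keyed by an incrementing sequence number yields lots in purchase order, which is exactly a FIFO queue.

```sas
/* Ordered hash object as a FIFO queue of purchase lots. */
data _null_;
   length seq shares price 8;
   declare hash lots(ordered:'a');    /* ascending key = FIFO order   */
   lots.definekey('seq');
   lots.definedata('seq','shares','price');
   lots.definedone();

   /* buy two lots */
   seq = 1; shares = 100; price = 50.25; rc = lots.add();
   seq = 2; shares = 200; price = 51.10; rc = lots.add();

   /* sell: the earliest lot is consumed first under FIFO */
   declare hiter it('lots');
   rc = it.first();                   /* oldest lot comes out first   */
   put 'FIFO head: ' seq= shares= price=;
run;
```

Descending order (`ordered:'d'`) gives LIFO; keying on price instead of sequence gives the HIFO and LOFO variants, and a hash-of-hashes holds one such queue per ticker.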


SB-081 : Picture Perfect: An Introduction to the Image Action Set available with SAS® Viya® Programming
Scott Koval, Pinnacle Solutions, Inc

The need for organizations to be able to process and analyze images has been growing. With the release of SAS Viya, a new set of actions for the Cloud Analytic Services (CAS) server has been made available through SAS Visual Data Mining and Machine Learning programming. The Image Action Set allows for users to load, process, and analyze unstructured data found in image files. This paper offers an overview and examples of common actions found within the Image Action Set.


SB-089 : Analyzing Amazon's Customer Reviews using SAS® Text Miner for Devising Successful Product Launch Strategies
Manideep Mellachervu, Oklahoma State University
Anvesh Reddy Minukuri, Comcast Corporation

The digital economy is showing tremendous growth in the 21st century and is having a massive impact on society. E-commerce is one element of the Internet of Things, and its worldwide sales have amounted to 2 trillion US dollars, which shows the popularity of online shopping and implies the evolution of retailers in this industry. A recent study conducted by GE Capital Retail Bank found that 81% of consumers perform online research before buying products. Consumers rely heavily on others' opinions and experiences when choosing a product. Businesses need to understand customers' views of their products, and of competitors' products, for strategic marketing. E-commerce businesses provide a platform for customers to generate user-experience content. Customer reviews are vital for a buyer choosing the best product out of numerous similar products available in the market. Companies need to analyze customers' perspectives through reviews to improve their business, evaluate customer engagement, and devise launch strategies for their products. This paper focuses on analyzing customer reviews, primarily on Amazon, using Python, SAS Text Miner, SAS Sentiment Analysis, and SAS Visual Studio. This project determines which product features receive high or low ratings, how the high-rating features of a best-selling product perform compared to a similar product sold by a different vendor, and how to account for customers' perception of product price across brands when launching a similar new product.


SB-093 : Quality Control for Big Data: How to Utilize High Performance Binning Techniques
Deanna Schreiber-Gregory, Henry M Jackson Foundation for the Advancement of Military Medicine
Karlen Bader, Henry M Jackson Foundation for the Advancement of Military Medicine

It is a well-known fact that the structure of real-world data is rarely complete and straightforward. Keeping this in mind, we must also note that the quality, assumptions, and base state of the data we are working with have a strong influence on the selection and structure of the statistical model chosen for analysis and/or data maintenance. If the structure and assumptions of the raw data are altered too much, then the integrity of the results as a whole is grossly compromised. The purpose of this paper is to provide programmers with a simple technique for aggregating data without losing information, and for checking the quality of binned categories in order to improve the performance of statistical modeling techniques. The SAS® high-performance analytics procedure HPBIN gives us basic syntax as well as various methods (Bucket, Winsor, Quantile, and Pseudo_Quantile), tips, and details on how to bin variables into comprehensible categories. We will also learn how to check whether these categories are reliable and realistic by reviewing the WOE (Weight of Evidence) and IV (Information Value) for the binned variables. This paper is intended for any level of SAS user interested in quality control and/or the SAS high-performance analytics procedures.
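A minimal sketch of the bucket-binning step the abstract describes, using a SASHELP data set for illustration (the ODS table name is taken as documented for the procedure):

```sas
/* Bucket-bin two numeric variables into five bins each and capture
   the bin boundaries in a data set for quality review.               */
proc hpbin data=sashelp.heart numbin=5 bucket;
   input weight height;
   ods output Mapping=bin_map;
run;
```

Adding a TARGET statement to a subsequent HPBIN run is what produces the WOE and IV statistics used to judge whether the bins are reliable.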


SB-102 : Speed up your Data Processing with SAS Code Accelerator.
Paul Segal, Teradata

SAS® In-Database Code Accelerator enables DS2 code to execute inside the database without translation to another language (such as SQL). This enables your data preparation steps to be dramatically accelerated, as you can now make use of the multi-threading capabilities in a massively parallel architected platform (such as the Teradata relational database management system [RDBMS] or the Apache Hadoop platform). In this short presentation, we introduce those of you unfamiliar with DS2 to the new features as well as demonstrate how performant it can be by running a live demonstration on the Teradata RDBMS.


SB-114 : Wow! You Did That Map With SAS®?! Round II
Louise Hadden, Abt Associates Inc.

This paper explores the creation of complex maps with SAS® software. The presentation covers the wide range of possibilities provided by SAS/GRAPH and polygon plots in the SG procedures, as well as replays, overlays in both SAS/GRAPH and the SG procedures, and annotation, including ZIP-code-level processing. The more recent GfK maps now provided by SAS, which underlie newer SAS products such as Visual Analytics as well as traditional SAS products, will be discussed. The pre-production SGMAP procedure released with Version 9.4 Maintenance Release 5 will be discussed in context.


SB-140 : Square Peg, Square Hole-Getting Tables to Fit on Slides in the ODS Destination for PowerPoint
Jane Eslinger, SAS

An output table is a square. A slide in Microsoft PowerPoint is a square. The table, being the smaller square, should fit inside the bigger square slide. Right? Well, not always. Despite the programmer's expectations, some tables will not fit on the slide created by the ODS destination for PowerPoint. It depends on the table. For instance, tables with more than 10 rows or more than 6 columns might end up spanning multiple slides. But, just as with the popular children's toy, by twisting, turning, or approaching the hole from a different angle, you can get the peg in the hole. This paper discusses three programming strategies for getting your tables to fit on slides: changing style attributes to decrease the amount of space needed for the table, strategically dividing one table into multiple tables, and using ODS output data sets for greater control over the structure of the tables. Throughout this paper, you will see examples that demonstrate how to apply these strategies using the popular procedures TABULATE, REPORT, FREQ, and GLM.


SB-141 : Advanced ODS Graphics Examples
Warren Kuhfeld, SAS

You can use SG annotation, modify templates, and change dynamic variables to customize graphs in SAS. Standard graph customization methods include template modification (which most people use to modify graphs that analytical procedures produce) and SG annotation (which most people use to modify graphs that procedures such as PROC SGPLOT produce). However, you can also use SG annotation to modify graphs that analytical procedures produce. You begin by using an analytical procedure, ODS Graphics, and the ODS OUTPUT statement to capture the data that go into the graph. You use the ODS document to capture the values of dynamic variables, which control many of the details of how the graph is created. You can modify the values of the dynamic variables, and you can modify graph and style templates. Then you can use PROC SGRENDER along with the ODS output data set, the captured or modified dynamic variables, the modified templates, and SG annotation to create highly customized graphs. This paper shows you how and introduces SG annotation and axis tables. This tutorial is based on the free web book: http://support.sas.com/documentation/prod-p/grstat/9.4/en/PDF/odsadvg.pdf. *Prior experience with ODS Graphics is assumed. Skill Level: Intermediate


SB-145 : Perl Regular Expression - The Power to Know the PERL in Your Data
Kaushal Chaudhary, Eli Lilly and Company
Dhruba Ghimire, Eli Lilly and Company

Perl regular expressions are one of the most powerful and efficient techniques for complex string manipulation. SAS® offers a Perl regular expression engine in Base SAS without any additional license requirement, which makes it a great addition to a SAS programmer's toolbox. In this paper, we present the basics of Perl regular expressions and the various Perl regular expression functions and call routines, such as PRXPARSE(), PRXMATCH(), and CALL PRXCHANGE, with examples. The presentation is intended for beginner and intermediate SAS programmers.
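A small sketch of the idiom built from the functions the abstract names (the data and variable names are hypothetical): compile the pattern once, then match and extract per observation.

```sas
/* Extract a US-style area code where one exists. */
data phones;
   input raw $char30.;
   if _n_ = 1 then re = prxparse('/\((\d{3})\) ?\d{3}-\d{4}/');
   retain re;                         /* reuse the compiled pattern   */
   if prxmatch(re, raw) then
      area_code = prxposn(re, 1, raw); /* capture group 1             */
   datalines;
(317) 555-1234
no phone here
;
run;
```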


SAS Super Demos

SD-149 : Creating a Custom Task in SAS Studio
Danny Modlin, SAS

Whether you are using SAS Studio in its full version or through SAS University Edition, you will notice that predefined tasks are included to help you, the user, generate code for several different purposes in SAS. Have you ever wanted to alter one of these tasks, or even create one of your own? In this Super Demo, we discuss how to edit and create your own SAS Studio tasks to use and share with others.


SD-150 : Executing Open Source Code in Machine Learning Pipelines of SAS Visual Data Mining and Machine Learning
Brett Wujek, SAS

Learn how to incorporate open-source code into your machine learning pipelines to integrate and compare models.


SD-151 : Tune In to Model Tuning
Brett Wujek, SAS

Learn how to build better models faster with the latest advancements in automated hyperparameter tuning in SAS® Visual Data Mining and Machine Learning.


SD-152 : Highly Customized Graphs Using ODS Graphics
Warren Kuhfeld, SAS

Learn how to use the ODS document, PROC TEMPLATE, PROC SGRENDER, a DATA step, and SG annotation to customize every component of the graphs that are produced by analytical procedures.


SD-153 : Heat Maps: Graphically Displaying Big Data and Small Tables
Warren Kuhfeld, SAS

Learn how to use heat maps in graphs, maps, and tables in ODS Graphics. Also learn how to highlight cells in tables in ODS.


SD-154 : What's New in the ODS Excel Destination
Jane Eslinger, SAS Institute

This demo highlights some of the newer features of the ODS Excel destination along with reasons to move to the ODS Excel destination if you have not already.


SD-155 : Creating Pivot tables using ODS Markup
Jane Eslinger, SAS Institute

This demo shows how quickly you can generate pivot tables and pivot graphs from your SAS data. Also demonstrated is how to automate this process by creating a SAS Studio task that generates pivot tables and graphs.


SD-156 : SAS 9.4 ODS in a Nutshell
Cynthia Zender, SAS

Come to this Super Demo to learn the new features of ODS in SAS 9.4. In a nutshell, you'll see examples of using ODS LAYOUT, creating lists and text blocks, and using the Report Writing Interface. Other topics include examples of cascading style sheets, using HTML5 as an ODS destination, and the ODS PowerPoint and ODS ePUB destinations.


SD-157 : Accessibility with ODS Output
Cynthia Zender, SAS

Creating sophisticated, visually stunning reports is imperative in today's business environment, but is your fancy report really accessible to all? Let's explore some simple enhancements that were made in the fourth maintenance release of SAS® 9.4 to Output Delivery System (ODS) that will truly empower you to accommodate people who use assistive technology. ODS now provides the tools for you to meet Section 508 compliance and to create an engaging experience for all who consume your reports.


SD-158 : The Future of SAS Enterprise Guide and SAS Studio
Amy Peters, SAS Institute

Get insights into the roadmap for the two interfaces and how they are converging.


Statistics / Advanced Analytics

AA-029 : Automatic Indicators for Dummies: A macro for generating dummy indicators from category type variables
Matthew Bates, Affusion Consulting

Dummy Indicators are critical to building many statistical models based on data with category type predictors. Most programmers rely on the "class" option within various procedures to temporarily build such predictors behind the scenes. This method carries with it a variety of limitations that can be overcome by auto-generating dummy indicators of all variables below a reasonable threshold of cardinality prior to running such procedures. Statistical modelers may find this topic a real effort and time saver while advanced SAS programmers looking for creative techniques of efficiently automating processes may find this macro worth geeking out over.


AA-030 : Confounded? This example shows how to use SAS chi-square tests, correlations and logistic regression to unconfound a result.
Michael Grierson, Self

The purpose of this paper is to describe an example of how to unconfound a confounded statistical result and to present a recipe for unconfounding an analytic conclusion. The confounded result is the conclusion that, since African American student loan borrowers are more likely to default on their student loans, the Department of Education "cannot ignore the interaction of race and student loans". This paper shows that student loan defaults are more strongly associated (by about 5 times) with lower median income status than with race.


AA-031 : Screening, Transforming, and Fitting Predictors for Cumulative Logit Model
Bruce Lund, Consultant for Magnify Analytic Solutions

The cumulative logit model is a logistic regression model where the target has 2 or more ordered levels. If there are only 2 levels, then the cumulative logit model is the binary logistic model. Predictors for the cumulative logit model might be "NOD" (nominal, ordinal, discrete), where typically the number of levels is under 20. Alternatively, predictors might be "continuous", where the predictor is numeric and has many levels. This paper discusses methods that pre-screen and transform both NOD and continuous predictors before the model-fitting stage. Once a collection of predictors has been screened and transformed, the paper discusses predictor variable selection for model fitting. One focus of this paper is determining when a predictor should be allowed to have unequal slopes. If unequal slopes are allowed, then the predictor has J-1 distinct slopes corresponding to the J values of the target variable. SAS® macros are presented which implement the screening and transforming methods. Familiarity with PROC LOGISTIC is assumed.


AA-035 : Monitoring the Relevance of Predictors for a Model Over Time
Ming-Long Lam, SAS Institute

In today's intelligence-driven economy, corporations increasingly rely on their algorithmic models to run their business. Like all tangible assets, models do depreciate and their accuracies diminish over time. In order to stay competitive, corporations constantly monitor their models. When signs of deterioration of model performance appear, stakeholders can determine if the models have to be proactively refreshed to correct the problems. Since every decision to refresh a model carries risks and can disrupt normal business, a solid business case must be presented to support the request to refresh a model. This paper presents a novel approach for monitoring model performance over time. Instead of monitoring accuracy of prediction or conformity of predictors' marginal distributions, this approach watches for changes in the joint distribution of the predictors. Mathematically, the model predicted outcome is a function of the predictors' values. Therefore, the predicted outcomes contain intricate information about the joint distribution of the predictors. This paper proposes a simple metric that is coined the Feature Contribution Index in this approach. Computing this index needs only the predicted target values and the predictors' observed values. Thus, we can assess the health of a model as soon as the scores are available, and raise our readiness for preemptive actions long before the target values are eventually observed. This index is model neutral because it works for any types of models that contain categorical and/or continuous predictors, and output predicted values or probabilities. Models can be monitored in near real time since the index is computing using simple and time-matured algorithms that can be run in parallel. Finally, it is possible to provide statistical control limits on the index. These limits help foretell whether a particular predictor is a plausible culprit in causing the deterioration of a model's appearance over time. 
Practically, if the indices suggest that the joint distribution of the predictors has changed over time, then you can investigate the causes, prepare for deteriorated model performance, and decide whether to refresh the model.
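The general idea of watching a scoring stream for distributional change, comparing a baseline period against a later period, can be illustrated with a generic two-sample Kolmogorov-Smirnov statistic. This is a hypothetical, language-neutral sketch of drift detection in general, not the paper's Feature Contribution Index or its SAS implementation:

```python
import bisect

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap
    between the empirical CDFs of the two samples. A large value
    suggests the distribution has shifted between the two periods."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    n_a, n_b = len(a), len(b)
    max_gap = 0.0
    for v in sorted(set(a) | set(b)):
        cdf_a = bisect.bisect_right(a, v) / n_a
        cdf_b = bisect.bisect_right(b, v) / n_b
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap
```

Applied to a predictor's (or a score's) values in two scoring windows, a statistic near 0 indicates stability, while a value near 1 indicates the distributions barely overlap.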


AA-041 : Alternative methods of regression when OLS is not right.
Peter Flom, Peter Flom Consulting

Ordinary least squares (OLS) regression is one of the most widely used statistical methods. However, it is a parametric model that relies on assumptions that are often not met. Alternative methods of regression for continuous dependent variables relax these assumptions in various ways. This paper will explore PROCs such as QUANTREG, ADAPTIVEREG, and TRANSREG for these data.


AA-042 : An introduction to classification and regression trees with PROC HPSPLIT.
Peter Flom, Peter Flom Consulting

Classification and regression trees are extremely intuitive to read and can offer insights into the relationships among the IVs and the DV that are hard to capture in other methods. I will introduce these methods and illustrate their use with PROC HPSPLIT.


AA-047 : Propensity Scores and Causal Inference for (and by) a Beginner
Bruce Lund, Consultant for Magnify Analytic Solutions

In an observational study, the subjects are assigned to treatments through a non-randomized process. In the simplest and most typical case there are two treatments, one of which is often deemed the "control". Associated with the subjects is an "outcome" of interest to the researcher. The outcome could be discrete, very often binary, or have continuous numeric values. The researcher wants to know the effect of the treatment on the outcome. But due to the non-random assignment of treatments, a simple comparison of outcomes, such as an average per treatment group, would be biased. One solution for removing the bias rests on finding covariates for the subjects such that the treatment can be regarded as random for subjects having essentially equal covariate values. Once this is accomplished, an analysis of outcomes can be performed. Two SAS® procedures, PROC PSMATCH and PROC CAUSALTRT, conduct the analysis of covariates and analysis of outcomes so that a causal effect can be estimated. This paper provides an introductory discussion of the analysis of causal effects in observational studies and gives examples of the usage of PSMATCH and CAUSALTRT. Other books or papers should be referenced for the advanced theory and details of the methodology.
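One common propensity-score strategy is to pair each treated subject with the control whose estimated propensity score is closest, within a maximum allowed gap (a caliper). The following is a hypothetical sketch of that greedy 1:1 nearest-neighbor idea only; the function name and data layout are invented, and this is not PROC PSMATCH's actual algorithm:

```python
def greedy_match(treated, control, caliper=0.1):
    """Greedy 1:1 nearest-neighbor matching on propensity score.

    treated, control: lists of (id, propensity_score) tuples.
    Returns (treated_id, control_id) pairs whose score gap is
    within the caliper; each control is used at most once.
    """
    available = dict(control)          # control id -> score
    pairs = []
    for t_id, t_score in sorted(treated, key=lambda t: t[1]):
        best_id, best_gap = None, caliper
        for c_id, c_score in available.items():
            gap = abs(t_score - c_score)
            if gap <= best_gap:
                best_id, best_gap = c_id, gap
        if best_id is not None:
            pairs.append((t_id, best_id))
            del available[best_id]     # matching without replacement
    return pairs
```

Once matched pairs are formed, outcomes can be compared within the matched sample as if treatment had been randomized, which is the step the outcome-analysis procedures automate.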


AA-077 : Estimating the Impacts of the EDA Public Works Program on County Employments Using SAS/ETS 14.1.
Kwideok Han, Oklahoma State University

Purpose: The Economic Development Administration (EDA) is a federal agency within the U.S. Department of Commerce created by the Public Works and Economic Development Act (PWEDA) of 1965. The primary focus of the EDA is the Public Works Program (PWP), which provides local communities matching grants to promote local economic growth through projects such as the construction of roads, sewers, water supply systems, and industrial parks. This study attempts to identify the impacts of the EDA public works program on county employment levels. The hypotheses to be tested are as follows: 1) EDA public works spending has a positive impact on local employment; 2) urban counties see greater employment generation than rural counties; and 3) there are spatial spillover effects from the EDA public works investments onto neighboring counties' employment. Methods: The study uses a panel data set on the EDA public works projects and county-level socio-economic variables, including county-level private non-firm employment series, during fiscal years 2010-2015. The panel nature of our data allows for the control of unobserved variations in the dependent variable of geographical regions over time. In addition to the panel structure, we utilize the spatial dimension of our data to improve the model performance by capturing the spatial spillover/externality effects of the EDA public works grants on neighboring counties' employment. Both a spatial panel fixed effects model and a spatial random effects model are estimated using SAS/ETS 14.1.


AA-091 : Logistic and Linear Regression Assumptions: Violation Recognition and Control
Deanna Schreiber-Gregory, Henry M Jackson Foundation for the Advancement of Military Medicine
Karlen Bader, Henry M Jackson Foundation for the Advancement of Military Medicine

Regression analyses are one of the first steps (aside from data cleaning, preparation, and descriptive analyses) in any analytic plan, regardless of plan complexity. Therefore, it is worth acknowledging that the choice and implementation of the wrong type of regression model, or the violation of its assumptions, can have detrimental effects on the results and future directions of any analysis. Considering this, it is important to understand the assumptions of these models and be aware of the processes that can be utilized to test whether these assumptions are being violated. Given that logistic and linear regression techniques are two of the most popular types of regression models utilized today, these are the ones that will be covered in this paper. Some logistic regression assumptions that will be reviewed include: dependent variable structure, observation independence, absence of multicollinearity, linearity of independent variables and log odds, and large sample size. For linear regression, the assumptions that will be reviewed include: linearity, multivariate normality, absence of multicollinearity and auto-correlation, homoscedasticity, and measurement level. This paper is intended for any level of SAS® user. This paper is also written to an audience with a background in theoretical and applied statistics, though the information within will be presented in such a way that any level of statistics/mathematical knowledge will be able to understand the content.
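As a worked illustration of one check mentioned above (hypothetical, not from the paper): a standard multicollinearity diagnostic is the variance inflation factor, which in the two-predictor case reduces to VIF = 1 / (1 - r^2), where r is the predictors' Pearson correlation. A VIF near 1 means little collinearity; common rules of thumb flag values above 5 or 10.

```python
def vif_two_predictors(x1, x2):
    """Variance inflation factor for two predictors:
    VIF = 1 / (1 - r^2), where r is their Pearson correlation."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    v1 = sum((a - m1) ** 2 for a in x1)
    v2 = sum((b - m2) ** 2 for b in x2)
    r2 = cov * cov / (v1 * v2)
    return 1.0 / (1.0 - r2)
```

With more than two predictors, the same quantity is computed by regressing each predictor on all the others and taking 1 / (1 - R²) from that auxiliary regression.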


AA-092 : Regularization Techniques for Multicollinearity: Lasso, Ridge, and Elastic Nets
Deanna Schreiber-Gregory, Henry M Jackson Foundation for the Advancement of Military Medicine
Karlen Bader, Henry M Jackson Foundation for the Advancement of Military Medicine

Multicollinearity can be briefly described as the phenomenon in which two or more identified predictor variables are linearly related, or codependent. The presence of this phenomenon can have a negative impact on an analysis as a whole and can severely limit the conclusions of a research study. In this paper, we will briefly review how to detect multicollinearity and, once it is detected, which regularization techniques would be the most appropriate to combat it. The nuances and assumptions of L1 (Lasso), L2 (Ridge Regression), and Elastic Nets will be covered in order to provide adequate background for appropriate analytic implementation. This paper is intended for any level of SAS® user. This paper is also written to an audience with a background in theoretical and applied statistics, though the information within will be presented in such a way that any level of statistics/mathematical knowledge will be able to understand the content.
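To make the shrinkage idea concrete, here is a minimal, hypothetical sketch (not the paper's code) of the ridge estimator in the simplest case of a single predictor with no intercept, where the closed form is beta = Σxy / (Σx² + λ). Setting λ = 0 recovers the OLS slope, and larger λ shrinks the coefficient toward zero, trading bias for variance:

```python
def ridge_slope(x, y, lam):
    """Ridge estimate for one predictor, no intercept:
    beta = sum(x*y) / (sum(x*x) + lambda)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)
```

The lasso (L1) penalty has no such closed form in general, which is why it is usually fit by coordinate descent; its distinguishing behavior is that it can shrink coefficients exactly to zero, performing variable selection.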


AA-108 : Using SAS® for Multiple Imputation and Analysis of Longitudinal Data
Pat Berglund, University of Michigan

"Using SAS for Multiple Imputation and Analysis of Data" presents the use of SAS to address missing data issues and the analysis of longitudinal data. Appropriate multiple imputation and analytic methods are evaluated and demonstrated through an analysis application using longitudinal survey data with missing data issues. The analysis application demonstrates the detailed data management steps required for imputation and analysis, multiple imputation of missing data values, subsequent analysis of imputed data, and finally, interpretation of longitudinal data analysis results. Key SAS tools, including DATA step operations to produce needed data structures and the use of PROC MI, PROC MIANALYZE, PROC MIXED, and PROC SGPLOT, are highlighted.


AA-109 : Application of heavy-tailed distributions using PROC IML, NLMIXED and SEVERITY
Palash Sharma, University of Kansas Medical Center

The theory of heavy-tailed probability distributions (Pareto, Weibull, Burr, etc.) has vast applications in many real-life situations and natural phenomena. This area of research is attractive not only for its theoretical probabilistic nature but also for its relevance to various branches of statistics. Heavy-tailed distributions are also used for modeling various biological, actuarial, financial, economic, hydrological, and engineering data. In this paper, we fit a Pareto distribution to data on the number of customers affected by electrical blackouts in the USA. We also simulate data arising from a Pareto distribution and estimate the parameters of the Pareto distribution using maximum likelihood estimation. A suite of SAS procedures is used for all computations, specifically PROC IML, SEVERITY, and NLMIXED.
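For a known lower bound x_min, the Pareto shape parameter has a closed-form maximum-likelihood estimate, alpha_hat = n / Σ ln(x_i / x_min). A minimal Python sketch of that formula for illustration (the function name is hypothetical; the paper itself carries out the estimation in PROC IML, SEVERITY, and NLMIXED):

```python
import math

def pareto_mle_alpha(data, x_min):
    """MLE of the Pareto shape parameter for observations
    x_i >= x_min: alpha_hat = n / sum(ln(x_i / x_min))."""
    n = len(data)
    return n / sum(math.log(x / x_min) for x in data)
```

Smaller estimated alpha means a heavier tail, i.e. extreme blackout sizes are more probable; when x_min is also unknown, its MLE is simply the sample minimum.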


AA-117 : An Introduction to the process of improving a neural network
Yuting Tian

This paper is a follow-up to an earlier paper on deep neural nets. That paper, by Lavery, explored the theory of deep neural nets and, at the end, showed two quick examples. One example, which the author admitted was included just to show code, used a deep neural net to predict loan defaults; the initial model had only fair results. This paper illustrates the process of using SAS tools to improve that network, in the hope that its accuracy can be improved.


AA-120 : Handling Missing Data in Exploratory Factor Analysis Using SAS
Min Chen, Cook Research Inc.

Exploratory Factor Analysis (EFA) is a statistical technique to reduce the dimension of data and to explore the latent structure within the data. Missing data is almost inevitable when conducting EFA. By default, the SAS procedure includes only complete cases, which is often not the researcher's first choice. Given that EFA can be performed on individual-level data or on a correlation or covariance matrix, different data formats can be fed into SAS and different missing data techniques can be applied. This article will demonstrate the above with SAS examples and briefly comment on how this is generally handled in other statistical software.
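One option the matrix-input route enables is building the correlation matrix with pairwise deletion, using every observation that is complete for each pair of variables, and feeding that matrix to the factor procedure. A hypothetical sketch of a pairwise-complete Pearson correlation, with missing values represented as None:

```python
def pairwise_pearson(xs, ys):
    """Pearson correlation using only the observations where
    BOTH variables are present (pairwise deletion)."""
    pairs = [(x, y) for x, y in zip(xs, ys)
             if x is not None and y is not None]
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    vx = sum((x - mx) ** 2 for x, _ in pairs)
    vy = sum((y - my) ** 2 for _, y in pairs)
    return cov / (vx * vy) ** 0.5
```

Pairwise deletion uses more of the data than listwise deletion, but because each correlation can rest on a different subset of cases, the resulting matrix is not guaranteed to be positive definite, which is one of the trade-offs such an article would weigh against multiple imputation.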


AA-121 : How to Score Big with SAS Solutions: Various Ways to Score New Data with Trained Models
Scott Koval, Pinnacle Solutions, Inc

After training a statistical model, the next step is to put it into production in order to score new data. While it might be tempting to manually write code to score data, this can lead to problems with precision, complexity, and updating. SAS solutions offer a wide variety of methods to do this. This paper covers several common techniques, including PROC SCORE, the CODE statement, and PROC ASTORE. By learning these approaches, SAS users can easily put complex models to work.


AA-137 : Getting Started with Bayesian Analytics
Danny Modlin, SAS

The presentation will give a brief introduction to Bayesian Analysis within SAS. Participants will learn the difference between Bayesian and Classical Statistics and be introduced to PROC MCMC.


AA-138 : Introduction to Machine Learning in SAS
Brett Wujek, SAS

This presentation answers the questions of what machine learning is and what SAS offers for machine learning. Examples of specific machine learning techniques such as random forests, gradient boosting, support vector machines, neural networks, and k-means are covered.


e-Posters

PO-032 : Great Time to Learn GTL
Richann Watson, DataRich Consulting
Kriss Harris, SAS Specialists Ltd

It's a Great Time to Learn GTL! Do you want to be more confident when producing GTL graphs? Do you want to know how to layer your graphs using the OVERLAY layout and build upon your graphs using multiple LAYOUT statements? This paper guides you through the GTL fundamentals!


PO-044 : Self-service utility to List and Terminate SAS grid jobs
Venkateswarlu Toluchuri, Tech Lead SAS Administrator

SAS® programmers often have difficulty finding information about their submitted jobs and typically depend on interactive client tools such as SAS® Enterprise Guide and PuTTY sessions to terminate them; in most cases, a SAS® administrator has to be involved to clean them up. The solution is to develop a self-service utility so that programmers can list and kill jobs that are no longer required. This approach also improves the overall performance of the environment and removes the dependency on SAS® administrators to kill user jobs.


PO-051 : An Update on the CS Standard Analyses and Code Sharing Working Group
Nancy Brucken, Syneos Health
Jared Slain, MPI Research

The Standard Analyses and Code Sharing Computational Science Working Group is providing recommendations for analyses, tables, figures, and listings for data that are common across therapeutic areas (laboratory measurements, vital signs, electrocardiograms, adverse events, demographics, medications, disposition, hepatotoxicity, pharmacokinetics) in the pharmaceutical industry. Ten white papers are at various stages of development, including six that have been finalized. The latest white paper to be published covers analyses and displays for adverse events. The working group also created an online platform for sharing code. The code repository contains a wealth of scripts that have been written by PhUSE members or donated by other organizations. Crowd-sourcing code development will enable consistent interpretation of methods and substantial savings in resourcing across the industry. This presentation will provide an update on these efforts.


PO-054 : Using your FREQ effectively: Displays to Decipher Proportional Odds in Ordinal Regression
Robert Downer, Grand Valley State University

The proportional odds assumption of the cumulative logit model is an intriguing challenge in the modeling of an ordinal response. Valuable insight for modeling decisions can be gained by further investigation of why the proportional odds assumption has been satisfied or not. This investigation can be exploratory and completely separate from the logistic modeling. Empirical cumulative logit plots are one possibility, but their interpretation is not intuitive with respect to odds or proportions. This paper presents exploratory methods that enhance the toolbox for understanding the proportional odds test. In conjunction with other SAS procedures, effective tabular and graphical options of PROC FREQ are used to support the findings of the proportional odds test. An informal application of the Breslow-Day test is introduced. Model development is not a focus of the paper, but some PROC LOGISTIC details are discussed.


PO-068 : Factors Responsible for Students' Enrollment at Oklahoma State University
Parag Vilas Sasturkar, Oklahoma State University

It is crucial for a university to attract bright students who will become educational leaders in this increasingly competitive world. The Institutional Research and Information Management (IRIM) department at Oklahoma State University (OSU) has been playing a vital role in campus decision-making, managing institutional performance, and providing information, research, and analysis on demand. Therefore, it is very important for IRIM to collect and provide accurate information to market OSU, i.e., to target the right audience (prospective students) every year. The main purpose of this research project is to help IRIM evaluate the different factors that drive an undergraduate student to enroll at OSU. This project will attempt to determine which students have a better chance, or probability, of choosing OSU over other universities. The dataset consists of student data including demographics, admissions, and academic activities. This is a vast dataset containing approximately 45,000 student records collected over the last 2+ years, with more than 15 suitable variables. This project will use SAS Enterprise Guide and SAS Enterprise Miner to conduct predictive analysis using methods such as decision trees, logistic regression, and random forests to determine variables in the prediction of students' enrollment.


PO-107 : An Easy Way to Know When to Buy and When to Sell your Stocks
Kaiqing Fan, PNC Bank

In the stock markets, there are thousands of stocks. How to make extra money from the stock market is always an attractive topic. We know that the best way to make money is to buy low and sell high. The question is how to decide what are appropriately low and high prices for a stock. Here I have two simple ways: 1) use the average price of each stock, based on its historical price data available online, and decide the low and high prices according to each user's appetite; 2) use the normal distribution to decide the upper and lower percentage bands of the prices, which users can also adjust based on their appetites. Many people will have questions about these methods, but if you are interested, please join my presentation; I believe I can answer all or most of your questions.
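Both rules sketched above reduce to thresholds derived from the historical mean and, under the normality assumption, the standard deviation. A minimal, hypothetical Python sketch of such buy/sell bands (not the presenter's actual method or code):

```python
import statistics

def price_bands(prices, z=1.0):
    """Buy/sell thresholds at mean - z*sd and mean + z*sd of the
    historical price series; z controls how aggressive the bands are."""
    mean = statistics.mean(prices)
    sd = statistics.pstdev(prices)   # population standard deviation
    return mean - z * sd, mean + z * sd
```

With z around 1, roughly the middle two-thirds of historical prices fall inside the bands under normality; a more conservative investor would raise z so that buy and sell signals fire less often.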


PO-111 : How to Avoid Possible Tricks When Using DATA STEP MERGE Instead of PROC SQL JOIN
Guangtao Gao, Cleveland State University

When we merge or join large data files, PROC SQL JOIN can avoid lots of potential troubles, but it costs too much execution time; DATA STEP MERGE runs much faster, but it has some potential tricks. If we can carefully avoid these tricks, DATA STEP MERGE would be the better choice. How to avoid these potential tricks is the topic of this paper. I also correct one popular method for modifying the differing lengths of variables directly using the LENGTH statement.


PO-115 : Purrfectly Fabulous Feline Functions
Louise Hadden, Abt Associates Inc.

Explore the fabulous feline functions and calls available in SAS® 9.1 and later. Using CAT functions and CAT CALLs gives you an easier way to streamline your SAS code and facilitate concatenation of character strings. So, leave verbose coding, myriad functions, and the vertical bar concatenation operators behind! SAS® 9.2 (and beyond) enhancements will also be demonstrated.