
Dell EMC Data Protection Search Version 1.1.x

Installation and Administration Guide 302-002-428

REV 06

Copyright 2015-2018 Dell Inc. or its subsidiaries. All rights reserved.

Published August 2018

Dell believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED AS-IS. DELL MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. USE, COPYING, AND DISTRIBUTION OF ANY DELL SOFTWARE DESCRIBED IN THIS PUBLICATION REQUIRES AN APPLICABLE SOFTWARE LICENSE.

Dell, EMC, and other trademarks are trademarks of Dell Inc. or its subsidiaries. Other trademarks may be the property of their respective owners.

Published in the USA.

Dell EMC Hopkinton, Massachusetts 01748-9103 1-508-435-1000 In North America 1-866-464-7381 www.DellEMC.com

2 Data Protection Search 1.1.x Installation and Administration Guide

CONTENTS

Preface

Chapter 1 About Data Protection Search
  About Data Protection Search
    Architecture
  Data Protection Search cluster components
    Data Protection Search licensing

Chapter 2 Installation
  Data Protection Search installation planning and considerations
  Virtual Appliance deployment preparation
    Data Protection Search node component requirements
  Deploying the Virtual Appliance on the vCenter server
    Deploying the vApp from a template
    Deploying the vApp from VMware vCloud Director
  Initializing the Data Protection Search environment
    Basic configuration
    Advanced configuration
    Installing Index Data components
    Installing Worker components
    Installing additional Worker nodes
  Installing a self-signed or trusted certificate
  Update LDAP configuration
    Updating LDAP configuration in the Data Protection Search Admin UI
    Updating LDAP configuration in the Data Protection Search Admin installation script
    Updating the Data Protection Search Admin Group
  Upgrading to the current release of Data Protection Search

Chapter 3 Administration
  Data Protection Search Administration Web Application
  Logging in to the Data Protection Search Admin user interface
  Data Protection Search Admin UI home
  Data Protection Search dashboard

Chapter 4 Sources
  Sources
  Add an Avamar server to Data Protection Search
  Default Avamar server limit
  Adding a NetWorker source server to Data Protection Search
  Connection Limitations considerations
  Updating an Avamar or NetWorker server
  Removing a server from Data Protection Search
  Registering agents manually

Chapter 5 Roles
  About roles
    Index Admin role
    Data Protection Search Admin role
    DPSearch Admin role
  Managing roles
    Add Index Admins
    Remove an Index Admin

Chapter 6 Indexes
  Indexes view
  Add an index
  Edit an index
  Remove an index

Chapter 7 Collections
  Collection activities
    Add a collection activity
    Add collection activity information
    Add sources to a collection activity
    Determine the collection activity scope
    Create a schedule for collection activities
    View summary details for a collection activity
  Managing collection activities
    Edit a collection activity
    View collection activities
    Remove a collection activity
    Run a collection activity
    Enable collection activities
    Disable collection activities

Chapter 8 Jobs
  Jobs
  Jobs views
  Data Protection Search job types
  Job types and statuses
  Monitor search jobs
  Monitor index jobs
  Monitor system jobs
  View system health
  View system services

Chapter 9 System
  Monitoring system health and services
    View worker node health
    View CIS node health

Chapter 10 Options
  Data Protection Search Options
  Configuring Email notifications

Chapter 11 Performing Searches
  About Searches
  Access the Search window
  Optimize search performance
  Optimize search results
    Change search filter
    View search jobs
    Narrow a search by index type
  Narrow a search by NetWorker or Avamar
    Specify a platform
    Specify a server
    Specify a client
  Narrow a search by file type
  Narrow a search by file attribute
    File name
    File size
    File location
  Narrow a search by date and time attributes
    Last modification date
    Backup date
  Include content that was not indexed in the search
  Using keywords to search
    Perform a basic search
    Perform an advanced search by using a Lucene query
  Restore files in Data Protection Search
  Search results
  Search criteria management
  Search performance factors

Chapter 12 Troubleshooting
  The Data Protection Search log files
    Copying log files by using WinSCP
    Copying log files by using PuTTy
  Viewing and filtering log files with Data Protection Search log viewer
  Troubleshooting the Data Protection Search web server
    Data Protection Search configuration files
  Troubleshooting web services for collector issues
  Increasing the maximum memory for the dpworker service
  Elasticsearch troubleshooting
    Controlling the Elasticsearch service
    Viewing Elasticsearch logs
    Viewing or changing Elasticsearch configuration
    Monitoring the health of the Elasticsearch cluster

Glossary

Preface

As part of an effort to improve product lines, periodic revisions of software and hardware are released. Therefore, all versions of the software or hardware currently in use might not support some functions that are described in this document. The product release notes provide the most up-to-date information on product features.

If a product does not function correctly or does not function as described in this document, contact a technical support professional.

Note

This document was accurate at publication time. To ensure that you are using the latest version of this document, go to the Support website at https://support.emc.com.

Revision history

This revision history provides a description for each revision of this Installation and Administration Guide.

Table 1 Data Protection Search Revision History

Revision  Date                Changes
06        August 16, 2018     Updated the Virtual Appliance deployment preparation section in the Installation chapter.
05        May 30, 2017        Updated for Data Protection Search 1.1 SP3, including End User License Agreement (EULA) updates.
04        August 17, 2016     Added Default Avamar server limit section to the Sources chapter.
03        February 17, 2016   Editorial updates.
02        October 23, 2015    Modified the Data Protection Search Upgrade procedure in the Installation section of this guide.
01        September 23, 2015  Initial revision of the Data Protection Search Installation and Administration Guide.

Purpose

This document describes how to install, configure, and use Data Protection Search.

Audience

This document is intended for the search administrator and index administrator who will be involved in managing Data Protection Search.


Related documentation

The Data Protection Search documentation set includes the following publications:

• Data Protection Search Software Compatibility Guide

• Data Protection Search Security Configuration Guide

• Data Protection Search Installation and Administration Guide

• Data Protection Search Release Notes

Special notice conventions that are used in this document

The following conventions are used for special notices:

NOTICE

Identifies content that warns of potential business or data loss.

Note

Contains information that is incidental, but not essential, to the topic.

Typographical conventions

The following type style conventions are used in this document:

Table 2 Style conventions

Bold              Used for interface elements that a user specifically selects or clicks, for example, names of buttons, fields, tab names, and menu paths. Also used for the name of a dialog box, page, pane, screen area with title, table label, and window.
Italic            Used for full titles of publications that are referenced in text.
Monospace         Used for:
                  • System code
                  • System output, such as an error message or script
                  • Pathnames, file names, file name extensions, prompts, and syntax
                  • Commands and options
Monospace italic  Used for variables.
Monospace bold    Used for user input.
[ ]               Square brackets enclose optional values.
|                 Vertical line indicates alternate selections. The vertical line means "or" for the alternate selections.
{ }               Braces enclose content that the user must specify, such as x, y, or z.
...               Ellipses indicate non-essential information that is omitted from the example.

You can use the following resources to find more information about this product, obtain support, and provide feedback.

Where to find product documentation

• https://support.emc.com

• https://community.emc.com

Where to get support

The Support website at https://support.emc.com provides access to licensing information, product documentation, advisories, and downloads, as well as how-to and troubleshooting information. This information may enable you to resolve a product issue before you contact Support.

To access a product-specific Support page:

1. Go to https://support.emc.com/products.

2. In the Find a Product by Name box, type a product name, and then select the product from the list that appears.

3. Click the following button:

4. (Optional) To add the product to My Saved Products, in the product-specific page, click Add to My Saved Products.

Knowledgebase

The Knowledgebase contains applicable solutions that you can search for by solution number, for example, 123456, or by keyword.

To search the Knowledgebase:

1. Go to https://support.emc.com.

2. Click Advanced Search. The screen refreshes and filter options appear.

3. In the Search Support or Find Service Request by Number box, type a solution number or keywords.

4. (Optional) To limit the search to specific products, type a product name in the Scope by product box, and then select the product from the list that appears.

5. In the Scope by resource list box, select Knowledgebase. The Knowledgebase Advanced Search panel appears.

6. (Optional) Specify other filters or advanced options.

7. Click the following button:

Live chat

To participate in a live interactive chat with a support agent:

1. Go to https://support.emc.com.

2. Click Chat with a Support Agent.

Service requests

To obtain in-depth help from Support, submit a service request. To submit a service request:

1. Go to https://support.emc.com.

2. Click Create a Service Request.


Note

To create a service request, you must have a valid support agreement. Contact a sales representative for details about obtaining a valid support agreement or with questions about an account.

To review an open service request:

1. Go to https://support.emc.com.

2. Click Manage service requests.

Online communities

Go to the Community Network at https://community.emc.com for peer contacts, conversations, and content on product support and solutions. Interactively engage online with customers, partners, and certified professionals for all products.

How to provide feedback

Feedback helps to improve the accuracy, organization, and overall quality of publications. You can send feedback to DPAD.Doc.Feedback@emc.com.


CHAPTER 1

About Data Protection Search

This section includes the following topics:

• About Data Protection Search

• Data Protection Search cluster components


About Data Protection Search

The Data Protection Search virtual appliance supports indexing and searching backup data across one or more Avamar and NetWorker servers.

With Data Protection Search you can perform the following actions:

• Index and search for files by name, location, size, owner, file type, and date.

• Perform a targeted full content index (FCI) on search results. With the FCI feature, you can preview the content and search for keywords inside files.

• Perform advanced search queries including symbols, wildcards, filters, and operators.

• Preview, restore, and download search results.

• Review the size of the selected files or directories.

Architecture

Data Protection Search is a pre-installed, Linux-based virtual appliance. Data Protection Search requires additional configuration during deployment. It can be deployed as a single node, or as multiple nodes to form a fault-tolerant cluster.

The following figure illustrates the architectural components of Data Protection Search.

Figure 1 Data Protection Search architecture

Components

The following table describes Data Protection Search components.

Table 3 Data Protection Search components

Component                   Description
Apache Tika                 An open source toolkit to extract full-text content and application-specific metadata from a wide variety of file types.
Collector Service           A service that manages scheduled jobs to collect metadata and, optionally, the full contents of backup files. The collector service leverages connector interfaces to interact with backup platforms.
Common Index Service (CIS)  An abstraction layer above Elasticsearch that provides the ability for multiple applications to share the same Elasticsearch cluster, enabling cross-platform searches. CIS also provides a security layer above Elasticsearch to prevent unauthorized access.
Elasticsearch               A highly scalable, high performance full-text index and search technology with built-in replication, capable of searching billions of objects within seconds. Elasticsearch leverages Apache Lucene for its indexes.
LDAP Authentication Server  Data Protection Search includes a built-in OpenLDAP authentication server with predefined users for administration and search operations. Additional Active Directory and OpenLDAP authentication servers can be added after configuration.
NGINX web server            An open source reverse-proxy web server that hosts the web-facing components of Data Protection Search.
Unicorn                     An HTTP server for Rack applications, which are used by the CIS components. For CIS to work, the Unicorn service must be running on the Data Protection Search Index Master node.

Single-node or multi-node deployment

Each Data Protection Search virtual appliance contains all the components that are required to provide index and search operations. Additional nodes can be added to form a Data Protection Search cluster.

Add extra nodes to:

• Improve the speed of indexing, monitoring, and search queries

• Store more indexed metadata and full content index data

• Provide replication

The following figure illustrates an example of a multi-node environment.


Figure 2 Multi-node environment

Replication

A Data Protection Search cluster can include one or more nodes. Additional nodes improve search performance and provide the option for replication.

To enable data index replication, connect to the UI as a user with the Data Protection Search Administrator role, and then go to Administration > Options > Index Options. Toggle the Apply Replica settings to existing indexes switch to ON. By default, the Apply Replica settings to existing indexes option is disabled.

If more than one Data Protection Search node exists, system indexes are replicated automatically. Data indexes are only replicated when enabled.

Note

If multiple nodes exist, it is recommended that the Number of replicas option is not set to 0. If no replicas are configured, failover cannot occur.


Data flow in Data Protection Search

The following figure illustrates the flow of data in Data Protection Search.

Figure 3 Data Protection Search data flow

Data Protection Search cluster components

You can configure a Data Protection Search cluster with multiple nodes or deploy all the components on a single node.

The Data Protection Search cluster environment includes the following components:

• Worker nodes (worker nodes can be restricted to either web server only, or collection service only)

• Index Master nodes (including Common Index Service)

• Index Data nodes

You can add Worker nodes to improve the speed of collections.

You can add Data nodes to improve the speed of indexing, to improve the speed of queries, to store indexed content, and to provide replication.

Data Protection Search licensing

Data Protection Search does not require its own license, and is available as a part of the Data Protection Suite license.


CHAPTER 2

Installation

This section includes the following topics:

• Data Protection Search installation planning and considerations

• Virtual Appliance deployment preparation

• Deploying the Virtual Appliance on the vCenter server

• Initializing the Data Protection Search environment

• Installing a self-signed or trusted certificate

• Update LDAP configuration

• Upgrading to the current release of Data Protection Search


Data Protection Search installation planning and considerations

Data Protection Search is a virtual appliance that can be deployed to VMware vCenter or vCloud, with one or more VMDKs.

How large is the environment?

When planning the Data Protection Search deployment, there are a number of factors to consider. The first is the size of the environment: the number of backup servers and, more importantly, the total number of clients to be indexed across those servers. The next factor to consider is how many backup records exist on the clients. For example, if a particular client contains 1 million files, and there are daily backups that are retained for 30 days, there may be 30 million backup records for that client.
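The arithmetic in the example above can be sketched in a few lines of Python. This is only a rough planning aid, not part of the product: the function name is hypothetical, and it assumes that every retained backup contributes one record per file, which gives an upper bound.

```python
def estimate_backup_records(files_per_client: int, retained_backups: int) -> int:
    """Rough upper bound on backup records for one client.

    Assumption (not from the product): each retained backup
    contributes one record for every file on the client.
    """
    return files_per_client * retained_backups


# The example from the text: 1 million files, daily backups retained for 30 days.
records = estimate_backup_records(1_000_000, 30)
print(f"{records:,} backup records")  # 30,000,000 backup records
```

Summing this estimate over all clients gives a first approximation of the total number of backup records to plan for.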

What is being indexed?

Data Protection Search only indexes traditional file system backups. The following backup types are ignored:

• Virtual machines

• Databases

• Microsoft modules

• Snapshots

• Block based backups

Data Protection Search provides flexibility for what is indexed from clients. It can be restricted to certain file types, and files modified within a certain date range.

The range of backup dates can be specified, either per client, or for all clients. Processing all historical backups takes longer than only indexing a short range of backups, or day-forward backups. By default, collection activities index backups that are less than a week old at the time of the collection. This default can be modified.
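As an illustration of the default backup date range, the following Python sketch selects the backups that fall inside a one-week window before a collection. It is a model of the rule as described here, not product code: the function and constant names are hypothetical, and it assumes "less than a week old" means strictly less than seven days before the collection time.

```python
from datetime import datetime, timedelta

# Assumption: "less than a week old" is modeled as strictly under 7 days.
DEFAULT_WINDOW = timedelta(days=7)


def in_collection_window(backup_time, collection_time, window=DEFAULT_WINDOW):
    """Return True if the backup predates the collection by less than the window."""
    age = collection_time - backup_time
    return timedelta(0) <= age < window


collection = datetime(2018, 8, 16, 2, 0)
backups = [
    datetime(2018, 8, 15, 23, 0),  # a few hours old: selected
    datetime(2018, 8, 10, 23, 0),  # about five days old: selected
    datetime(2018, 8, 1, 23, 0),   # about two weeks old: skipped
]
selected = [b for b in backups if in_collection_window(b, collection)]
print(len(selected), "of", len(backups), "backups fall in the default window")
```

Widening the window to cover all historical backups increases the number of backups to process, which is why the initial collection takes longer.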

To keep the index current, it is recommended that collections are scheduled to run on the same schedule as backups. For example, daily or weekly.

Metadata or full content indexing?

You can specify the type of indexing for each collection activity. The default indexing method is metadata only, and basic metadata is collected for each file:

• File name

• Path

• Size

• Date

• Server operating system

• Client

• File location (backup details)

This information is available from the Avamar or NetWorker databases, so processing finishes quickly, and backup storage access is not required. The amount of space that is required to store the record in the index is minimal.

Full content indexing restores full file data to the Worker node. The data is scanned, and all text strings are extracted and indexed. The text is searchable, and a preview of the file, either text-based or a thumbnail image, is stored in the index. Full content searches provide an enhanced search experience, at the expense of slower indexing, larger index size, and impact on backup storage performance.

It is recommended that you use metadata-only indexing as the primary search method.

How many Data Protection Search Appliances are required?

It is possible to deploy Data Protection Search as a single, all-in-one node for both indexing and search. If enough storage is available, an all-in-one node can handle billions of backup records. However, search performance is reduced as the number of records increases. Filtered searches can take a few seconds or less, but broader searches and visual filters take longer.

Adding Index Data nodes provides better search performance. The improvement is most noticeable for concurrent searches, because their computing and memory requirements are distributed across all nodes.

Additional Data Protection Search Worker nodes enable collection activities to process faster, as the work items are distributed across all nodes. Add additional Worker nodes if there is a limited indexing window, and Data Protection Search cannot process all clients in that window.

Adding Worker nodes to process full content indexing for a single Avamar server yields diminishing returns, because the server, not Data Protection Search, is the bottleneck. For example, indexing performance might increase by 50% to 70% after one extra node is added, but by only 40% for the next node, and the gain decreases with every additional node.

Note

The initial collection (index) takes longer than subsequent collections. Just as with full and incremental backups, Data Protection Search must initially process all files, and then process only new or changed files. Indexing weeks or months' worth of historical backups takes longer than processing current daily backup collections.

Are replicas required?

Elasticsearch can replicate and fail over indexes automatically. For one Data Protection Search node without replication, changing the number of replicas to 0 prevents a yellow warning icon from displaying for the Elasticsearch cluster and indexes on the dashboard.

Note

If there are multiple nodes, it is recommended that the number of replicas is not set to 0. If there is no replica, failover cannot occur.

A Data Protection Search cluster must contain at least one more Index Master or Index Data node than the configured number of replicas (replicas + 1). Additional Index Data nodes improve search performance and support replication.
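The replica rule above can be expressed as a small check. The following Python sketch is a simplified model, not a product or Elasticsearch API: the function names are hypothetical, and it assumes that the only reason for a yellow status is that there are too few index nodes to hold every replica copy. Real cluster health also depends on factors such as shard allocation and disk space.

```python
def min_index_nodes(replicas: int) -> int:
    """Minimum number of Index Master or Index Data nodes for a replica count."""
    if replicas < 0:
        raise ValueError("replicas must be zero or greater")
    return replicas + 1


def expected_status(index_nodes: int, replicas: int) -> str:
    """Simplified model: 'green' if every replica copy has its own node, else 'yellow'."""
    return "green" if index_nodes >= min_index_nodes(replicas) else "yellow"


print(expected_status(index_nodes=1, replicas=1))  # yellow: single node, replica cannot be placed
print(expected_status(index_nodes=1, replicas=0))  # green: single node, no replicas requested
print(expected_status(index_nodes=3, replicas=2))  # green: enough nodes for two replicas
```

This matches the guidance for a single node, where setting the number of replicas to 0 avoids the yellow warning, and for multiple nodes, where at least one replica is needed for failover.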

How much space is required for the indexes?

The space that is required for indexes depends on a number of factors:

• Metadata only or full content indexing

• For full content indexing, the size and type of files being indexed

• For metadata indexing, the length of file names and pathnames

• The amount of duplication (unchanged files appearing in multiple backups)

• Replication

For metadata only, with 80% duplication, each backup record can use approximately 100 bytes to 200 bytes. With zero duplication (all unique files) which is unlikely, the average can be as large as 400 bytes to 600 bytes.


For full content indexing, the range is wider. Large multimedia files use very little space in the index, while small documents use significantly more. With a typical dataset at 80% duplication, each backup record can use 1 KB to 4 KB. With zero duplication, that average can be as large as 15 KB to 30 KB, or more.

Table 4 Disk use based on indexing

                                 Backup records
Index type and size per record   1 billion   5 billion   10 billion   20 billion
Metadata only index
  100 bytes                      93 GB       466 GB      931 GB       1.9 TB
  200 bytes                      186 GB      931 GB      1.9 TB       3.7 TB
Full-content index
  1 KB                           954 GB      4.8 TB      9.5 TB       19 TB
  4 KB                           3.8 TB      19 TB       38 TB        76 TB
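The figures in this table follow from multiplying the number of backup records by the average size per record. The Python sketch below shows that calculation as a planning aid only. The function names are hypothetical, sizes use binary units (GiB and TiB), and the replica multiplier assumes that each replica stores one full extra copy of the index. Results can differ slightly from the table depending on the unit convention and rounding.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4


def index_size_bytes(backup_records: int, bytes_per_record: int, replicas: int = 0) -> int:
    """Estimated index size; assumes each replica is a full extra copy."""
    return backup_records * bytes_per_record * (1 + replicas)


def human(size_bytes: int) -> str:
    """Format a byte count in binary units for readability."""
    if size_bytes >= TIB:
        return f"{size_bytes / TIB:.1f} TiB"
    return f"{size_bytes / GIB:.0f} GiB"


# 1 billion metadata-only records at about 100 bytes each.
print(human(index_size_bytes(1_000_000_000, 100)))  # 93 GiB

# The same index with one replica needs roughly twice the space.
print(human(index_size_bytes(1_000_000_000, 100, replicas=1)))
```

Because the bytes-per-record figure varies with duplication and file types, it is safer to run the estimate with both ends of the expected range than to rely on a single value.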

How large will the indexes grow over time?

If an activity is set up to recur (for example, daily), Data Protection Search continues to process new backups/save sets as they are created. New or modified items are added to the index, and there are references for items that are unchanged from previous backups.

When a backup expires, or is deleted, any items no longer in any backups for that client are removed from the index. Daily garbage collection and a monthly reconciliation job remove expired or deleted items.

The index size for a particular client initially grows until all current backups (for example, 30 days' worth for monthly backup retention) are in the index. Then the index size stabilizes, and growth matches the data growth on that client. Adding more clients increases the size of the index.

View the size of the index in its properties. Indexes can be deleted to recover space.

How long does a search operation take?

Searches in the Data Protection Search and Elasticsearch framework are typically fast, often taking less than a second. However, as the number of items in an index increases, search speed is often reduced, particularly on a single node. Generally, the broader the search, the greater the impact of a large index.

The broadest possible search is a wildcard (*) search across all indexes. Wildcard searches can complete in a few seconds even with hundreds of millions of indexed items, but as the item count increases to the billions, searches might time out before completing.

The recommendation is to use filters to narrow searches as much as possible. A wildcard (*) search can take 30 s to 60 s in a single-node environment with 15 billion backup records (1.5 billion unique files). A search that uses type and client keyword filters can take less than 1 s to complete. Because millions or billions of results are not useful, the use of filters is recommended.

Note

Broad searches perform faster against a static index. If indexing is running during the search, expect broad searches to take longer.

Visual filters require that matching items from the current search are aggregated, which requires large amounts of memory and time. Filters can narrow the results being aggregated to reduce the impact on memory and time. If a visual filter does not complete in a timely manner (10 s), the current results are displayed with a warning message indicating that the results are incomplete.

Virtual Appliance deployment preparation

Before beginning the Data Protection Search Virtual Appliance deployment, create or collect the necessary groups, users, and information.

Required information

The following information is required for the deployment and configuration of Data Protection Search:

l Network:

n Hostname and IP

n DNS server

n Gateway

n Domain name

l LDAP:

n Hostname or IP

n Port

n Base domain name

n Username/password

n User account with the ability to query LDAP

n Data Protection Search Admin Group

l The following are the minimum virtual hardware requirements for the virtual machine:

n 32 GB RAM

n 2 CPUs

n 500-GB disk

n A dedicated Worker node that is not configured for Elasticsearch needs only the default 40-GB disk

Note

If Elasticsearch is being used (Index Master or Index Data node), it requires an additional disk that varies in size depending on the requirements.

LDAP Groups and Users

Data Protection Search uses Active Directory for any LDAP services:

l CIS Admin to administer the Common Indexing Service (CIS), which is the security and multi-indexing layer that sits on top of Elasticsearch:

n Used to query LDAP

n It is helpful to create a user and password for the CIS Admin that does not expire

l Create a Data Protection Search Admin Group:

n Active Directory: Create a Global Security Group.

n OpenLDAP: Create a Data Protection Search Admin Group with the object type GroupOfNames.

l Create at least one Admin user and add the user to the Data Protection Search Admin Group. A Data Protection Search Admin user can also be an Index Admin and a Search Admin (for evaluation)

l (Optional) Create the following:

n Index Admins (users and/or groups): Index Admins manage indexes, and manage and monitor collection activities and jobs. Index Admins receive notifications for the jobs that they started.

n Search Admins (users and/or groups): Each index has a list of Search Admins (users/groups). Only Search Admins belonging to a specified index can search that index. Search Admins are assigned either Read only or Full access permissions for a specified index.

Note

Members of the Data Protection Search Admin Group are also Index Admins by default.

Virtual appliance

Data Protection Search is a virtual appliance that is composed of an OVF and a single vmdk.

Convert the deployed virtual machine to a template, and deploy the template as required for Data Protection Search nodes.

You can partially configure the template to simplify and speed up future node deployments:

l Accept the EULA and answer the following question:

Is this appliance being deployed in China, Taiwan, Hong Kong, or Macau? y(es) or n(o) (Default: no)

l Set up common networking values, such as domain, DNS, and routing

l Set up the date/time zone

l Update/change passwords

Data Protection Search node component requirements

There are several factors to consider when planning the Data Protection Search cluster.

The following table describes the Data Protection Search node requirements.

Table 5 Data Protection Search node requirements

Node           CPU   Memory   Disk
All in one     4+    32 GB+   Disk 1: 40 GB. A second disk is required for nodes running as Elasticsearch data nodes, including all-in-one nodes. Disk 2 is mounted to a directory for Elasticsearch at deployment.
Worker         2+    16 GB    40 GB
Index Master   2+    32 GB    Disk 1: 40 GB. An Index Master node also serves as an Index Data node. Therefore, a second disk is required. Disk 2 is mounted to a directory for Elasticsearch at deployment.
Index Data     2+    32 GB    Disk 1: 40 GB. A second disk is required for nodes running as Elasticsearch data nodes. Disk 2 is mounted to a directory for Elasticsearch at deployment.
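The CPU and memory minimums in Table 5 can be checked before a node is configured. The following shell sketch is one possible pre-flight check; the function name and the role labels (all-in-one, worker, index-master, index-data) are hypothetical and are not part of the product.

```shell
#!/bin/sh
# check_node_resources ROLE CPUS MEM_GB
# Hypothetical helper (not part of the product). Compares the given CPU
# count and memory (in whole GB) with the Table 5 minimums for ROLE.
check_node_resources() {
    role=$1 cpus=$2 mem_gb=$3
    case "$role" in
        all-in-one)              min_cpus=4; min_mem=32 ;;
        worker)                  min_cpus=2; min_mem=16 ;;
        index-master|index-data) min_cpus=2; min_mem=32 ;;
        *) echo "unknown role: $role" >&2; return 2 ;;
    esac
    if [ "$cpus" -ge "$min_cpus" ] && [ "$mem_gb" -ge "$min_mem" ]; then
        echo "OK: $role meets the Table 5 minimums"
    else
        echo "FAIL: $role needs at least $min_cpus CPUs and $min_mem GB RAM"
        return 1
    fi
}

check_node_resources worker 2 16
```

On a running Linux node, the CPU count can be read with nproc and the memory from the MemTotal line of /proc/meminfo. Round the memory value to the nearest GB, because the reported total is usually slightly below the configured size.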

Deploying the Virtual Appliance on the vCenter server

The Data Protection Search Virtual Appliance can be deployed to VMware vCenter by following the wizard.

Before you begin

Review the following sections before deployment:

l Environment and system requirements

l Data Protection Search node component requirements

l Virtual Appliance deployment preparation

Procedure

1. From a vSphere client that connects to a VMware vCenter Server with ESX hosts, click File > Deploy OVF Template.

Note

A vCenter server is required to deploy the OVF.

2. Browse to the DPSearch.ovf file and click Next.

3. In the Name and Location window, specify a Name and an Inventory Location for the deployed template, and click Next.

4. Select a Host / Cluster on which to run the deployed template and click Next.

5. Select a Resource Pool within which to run the deployed template, and click Next.

6. In the Storage window, select a destination storage for the virtual machine files, and click Next.

7. Select the Disk Format in which to store the virtual disk:

l Thick Provision Lazy Zeroed

l Thick Provision Eager Zeroed

l Thin Provision (recommended)

Note

Eager Zero yields the best performance, but also takes the most time to initialize. Thick provisioning does not fill the drive unless eager zeroed is selected. If thick provisioning is selected, the storage capacity for the entire virtual disk is allocated on the data store at virtual disk create time. Thin provisioning means that the capacity on the data store is allocated to the virtual disk as required, up to the full size of the virtual disk.

8. Click Next.

9. In the Network Mapping window, specify which networks the deployed template uses, and click Next.

10. In the Ready to Complete window, verify that the options are correct and click Finish, or click Back to change options.

11. When the deployment completes successfully, right-click the newly deployed Virtual Machine and select Template > Convert to Template.

Converting the virtual machine to a template provides the ability to deploy multiple nodes.

12. When the conversion completes, continue to Deploying the vApp from a template.

Note

It is recommended that you rename the template to something intuitive, for example DPSearch.

Deploying the vApp from a template

It is recommended that you deploy the virtual machine from the template so that you can add nodes as required.

Procedure

1. Right-click the virtual machine and select Deploy Virtual Machine from this Template....

2. In the Name and Location window, specify a Name and an Inventory Location for this virtual machine, and click Next.

3. Select a Resource Pool within which to run the virtual machine and click Next.

4. In the Storage window, select a destination storage for the virtual machine files, and click Next.

5. In the Guest Customization window:

l Do not click the checkbox to Power on this virtual machine after creation.

l Select the Do not customize option.

l Click Next.

6. In the Ready to Complete window, verify that the options are correct and click Finish, or click Back to change options.

A virtual machine clone is created from the Template.

7. When creation of the virtual machine clone completes, right-click the virtual machine and select Edit Settings....

8. Continue to Customizing the Virtual Machine after deployment.

Add virtual disks on the vCenter Server

If required, you can add virtual hard disks to provide storage for index data. Additional hard disks must be configured for all-in-one nodes and data nodes. Additional disks are not required for dedicated worker nodes.

Procedure

1. From the vSphere client, locate the virtual machine that was deployed from the template.

2. Right-click the virtual machine and then select Properties.

The virtual machine Properties window opens.

3. In the Properties window, on the Hardware tab, click Add.

4. In the Choose the type of device you wish to add list:

a. Select Hard Disk.

b. Click Next.

5. In the Select a disk window:

a. Select Create a new virtual disk.

b. Click Next.

6. In the Create a disk window, complete the following sections, and then click Next.

l Capacity (disk size)

l Disk Provisioning: Thin provision is recommended

l Location

7. In the Advanced Options window, click Next to accept the default settings.

The settings on this page do not typically change.

Configure the Virtual Machine on a vCenter Server

You can configure the virtual machine after installation.

Note

Complete the virtual machine configuration immediately after installation. Changing the virtual machine settings later can make the virtual machine unstable.

Procedure

1. In the Virtual Machine Properties window, configure the following settings:

l Memory

l CPU

l Disk

2. Add and configure virtual disks on the vCenter Server.

Note

Add hard drives for the index data.

3. To open the console:

a. Select the newly deployed virtual machine in the vCenter server.

b. Select Power on from the list of commands.

Deploying the vApp from VMware vCloud Director

If VMware vCloud Director is available, you can use it to deploy the virtual machine. The following figure illustrates the VMware vCloud Director.

Figure 4 VMware vCloud Director

Procedure

1. Log in to the VMware vCloud Director.

2. To begin the deployment, click Add vApp from OVF.

3. Browse to the OVF file.

4. Complete the wizard, accepting the defaults with the exception of the Computer Name.

5. In the Configure Networking section of the Wizard, change the Computer Name, and click Next.

6. If required, modify the following components in the Customize Hardware section.

l CPU:

n Number of virtual CPUs

n Cores per socket

n Number of sockets

l Memory

l Hard Disks

7. Click Finish.

It can take some time for the new vApp creation to complete.

Adding virtual disks with vCloud

If required, configure new virtual hard disks. Additional hard disks must be configured for all-in-one nodes and data nodes. Additional disks are not required for dedicated worker nodes. Additional disks provide storage for index data.

Procedure

1. Right-click the virtual machine and select Properties.

2. In the Hardware tab of the Virtual Machine Properties window, click Add in the Hard Disks section of the window.

Data Protection Search installation planning and considerations provides details on recommended disk size.

3. Accept the defaults for the remaining fields, and click OK.

Customizing the Virtual Machine in vCloud after deployment

You can customize the virtual machine in vCloud after deployment.

Before you begin

Note

Complete the virtual machine configuration immediately after deployment as changing the virtual machine settings later can make the virtual machine unstable.

Procedure

1. In the Virtual Machine Properties window, configure Memory, CPU, and Disk. Data Protection Search node component requirements provides the specific recommendations for these settings.

2. In the Guest Customization tab, click to disable Allow local administrator password. If Allow local administrator password remains enabled, vCloud generates a random root password for the virtual machine.

If Allow local administrator password is disabled, the default password, linux, remains.

3. Add and configure additional disks as described in Adding virtual disks with vCloud.

Note

Add additional hard drives to hold index data.

4. To open the console, click the newly deployed virtual machine in the vCenter server, and choose Power on from the list of commands.

Initializing the Data Protection Search environment

Data Protection Search configuration and general operation are handled through a web-based Administration console. However, some settings must be configured by using a menu system in the Linux terminal. There are two methods of configuration: basic (recommended default settings that are excluded from the wizard) and advanced. Accepting the End User License Agreement (EULA) and configuring network settings must be completed before the rest of the options are displayed:

l Accepting the End User License Agreement

l Configuring network settings

l Configuring an LDAP server

l Providing LDAP Admin user and group accounts

l Specifying the criteria for the indexing subsystem

l Setting the local time and time zone

l Updating system passwords

l Setting the Puppet role for future upgrades

Procedure

1. Log in to the Data Protection Search terminal:

Username: dpsearch

Password: dpsearch

Note

The default password is dpsearch. Change the password when possible for security reasons.

2. Type the following commands:

su

Type the root password: linux

Note

The default password is linux. Change the password when possible for security reasons. For versions earlier than Data Protection Search 1.1 SP3, enter cd download to change to the directory that contains the install scripts.

3. Type the bash dp_install.sh command.

The option to Show EULA opens.

4. To display the EULA, type 1.

5. To accept the terms of the EULA, type yes.

6. To exit the EULA, type q.

7. Type 2 to initialize the environment.

The YaST2 menu opens to the YaST Control Center to configure network settings. Use the arrow keys, Tab, and Enter keys to browse and Alt-option to select items.

8. Tab to Network Devices and select Network settings from the list of devices.

The Network Settings window displays details of the current Data Protection Search host:

l Device name and type

l IP address

l Bus ID

9. Type Alt-I to edit the eth0 device.

10. If you are using a static IP, set the following network options, and then type Alt-N or tab to Next:

l IP address

l Subnet mask

l Hostname

11. Tab to Hostname/DNS and type Alt-S to set the following options:

l Hostname

l Domain name

l Name servers

l Domain search

12. Tab to Routing, or type Alt-u, to set the Default gateway.

Note

A default gateway is required. If one is not set, the deployment scripts display errors during firewall configuration.

13. To complete the network options and exit YaST2, select OK or type Alt-o.

If you change the hostname, a system restart is required for the change to take effect. Type yes to restart.

14. Before continuing with installation on the Data Protection Search node, switch to the DNS server to configure the host lookup. Right-click the DNS/Active Directory/Open LDAP server, and select New Host:

l To add a record to resolve the hostname of the server to its IP address, type the node name in the Name field.

l Add the IP address defined in step 10.

l Ensure that nslookup returns the correct hostname/IP.

l For NetWorker, the host IP address must resolve to the same hostname defined for the NIC.

l Click Add Host.

15. When the Data Protection Search node restarts, complete the following:

l Log in with username: dpsearch

l Password: dpsearch

l Log in as root, su

l Type the root password: linux

l To verify that the Active Directory/Open LDAP server is configured for the Data Protection Search node, use the ping command

l Type the bash dp_install.sh command again to restart the installation process

Note

The default password is dpsearch for the dpsearch user, and linux for the root user. Change both passwords when possible for security reasons.
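The DNS check in step 14 (the hostname resolves to the expected IP address, and the IP address resolves back to the same hostname) can be scripted. The sketch below is an illustration, not the product's own validation: the function names, host name, and address are placeholders, and it uses getent instead of nslookup because the output is easier to parse. Note that getent also consults /etc/hosts, not only DNS.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Data Protection Search).

# check_forward_lookup HOSTNAME EXPECTED_IP
# Succeeds when HOSTNAME resolves and its first address is EXPECTED_IP.
check_forward_lookup() {
    resolved=$(getent hosts "$1" | awk '{ print $1; exit }')
    [ -n "$resolved" ] && [ "$resolved" = "$2" ]
}

# check_reverse_lookup IP EXPECTED_HOSTNAME
# Succeeds when IP resolves back to EXPECTED_HOSTNAME.
check_reverse_lookup() {
    resolved=$(getent hosts "$1" | awk '{ print $2; exit }')
    [ -n "$resolved" ] && [ "$resolved" = "$2" ]
}

# Example with placeholder values:
# check_forward_lookup dpsearch01.domain.com 192.0.2.10 && echo "forward OK"
# check_reverse_lookup 192.0.2.10 dpsearch01.domain.com && echo "reverse OK"
```

Both checks should succeed before the installation continues, because NetWorker requires the host IP address to resolve to the hostname defined for the NIC.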

Basic configuration

Both Advanced Configuration and Basic Configuration options are available for Data Protection Search. Use the Advanced Configuration option if additional customization is required for the environment. To complete the recommended Basic Configuration, follow the instructions in this section.

Procedure

1. Log in to the Data Protection Search terminal:

Username: dpsearch

Password: dpsearch

Note

The default password is dpsearch. Change the password when possible for security reasons.

2. Type the following commands:

su

Type the root password: linux

Note

The default password is linux. Change the password when possible for security reasons. For versions earlier than Data Protection Search 1.1 SP3, enter cd download to change to the directory that contains the install scripts.

3. Type the bash dp_install.sh command.

The available setup options are shown in the following figure.

Figure 5 Basic Configuration options

4. Type 3 to select Basic Configuration, and display the options available for Basic Configuration.

5. Type 1 to install the first node (Index Master node), and respond to the following prompts:

l Please enter the Elasticsearch Cluster name (default: DPSearchCluster): Provide a unique cluster name

l LDAP Type (for example, AD or openldap, default: AD):

l LDAP hostname (for example, ldap.domain.com): Provide the hostname of the AD or OpenLDAP server

l LDAP port (for example, 389): Provide the port for the selected LDAP server. For AD, the default port is 389

l LDAP Base DN: (For example, dc=domain, dc=com): Provide the base domain for the LDAP server

l LDAP query username: Provide a username with rights to query the LDAP server (this user also administers CIS)

l LDAP query password: Provide the password for the username provided

The user credentials that are provided are validated, and the second disk drive is partitioned and mounted.

l Data Protection Search Admin Group name (A valid group in AD or OpenLDAP that contains Data Protection Search Administrators, such as the Data Protection Search Admin Group): Provide the name of the existing LDAP security group, for example, DPSearch Admin Group.

6. When the Data Protection Search Admin Group is confirmed, you are prompted to Press any key to continue....

The installation takes several minutes to complete.

7. Type 0 for Back to last menu.

8. Type 6 for Change date and time: and select the options when prompted:

l Please select a continent, ocean, "coord", or "TZ". The available continents, oceans, coord, and TZ options are displayed.

l Please select a country whose clocks agree with yours. The date and time zone for the choices are displayed.

l Type Y when prompted, Is the above information OK?, or make the required changes.

9. Type 7 for Puppet Configuration:

a. Type 1 for Configure as Puppet master.

b. Type 0 for Back to last menu.

c. Type 0 to exit the setup menu.

10. Restart the node, log in to the terminal, and change to the download directory:

Username: dpsearch

Password: dpsearch

Directory: /home/dpsearch/

11. Run the following command to validate the installation:

bash validate_dpsearch_install.sh -a

12. Verify that you can launch the Data Protection Search web application and log in.

13. Add additional Index Data/Worker nodes as described in the following sections:

l Installing Index Data components

l Installing additional Worker nodes
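The LDAP Base DN requested in step 5 is the domain name written as a series of dc= components. The following shell sketch derives it from a DNS domain name. The function name is hypothetical, and the commented ldapsearch call assumes that the OpenLDAP client tools are installed, which this guide does not state.

```shell
#!/bin/sh
# domain_to_base_dn DOMAIN
# Hypothetical helper (not part of Data Protection Search).
# Converts a DNS domain such as domain.com to dc=domain,dc=com.
domain_to_base_dn() {
    printf 'dc=%s\n' "$(printf '%s' "$1" | sed 's/\./,dc=/g')"
}

domain_to_base_dn domain.com   # prints dc=domain,dc=com

# Optional manual check of the LDAP query account and admin group.
# Assumption: the ldapsearch client is available on the machine used.
# ldapsearch -x -H ldap://ldap.domain.com:389 \
#     -D "username@domain.com" -W \
#     -b "$(domain_to_base_dn domain.com)" "(cn=DPSearch Admin Group)"
```

Running such a query before the installation can confirm that the query account can bind and that the admin group exists, which are the two items the installer validates.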

Advanced configuration

Both Advanced Configuration and Basic Configuration options are available. Although Basic Configuration is recommended, use the Advanced Configuration option if additional customization is required for the environment.

Procedure

1. Log in to the Data Protection Search terminal:

Username: dpsearch

Password: dpsearch

Note

The default password is dpsearch. It is recommended that you change the password for security reasons.

2. Type the following commands:

su

Type the root password: linux

Note

The default password is linux. Change the password when possible for security reasons. For versions earlier than Data Protection Search 1.1 SP3, enter cd download to change to the directory that contains the install scripts.

3. Type the bash dp_install.sh command.

4. Type 2 to initialize the environment.

To configure network settings, the YaST2 menu opens to the YaST Control Center. To browse, and select items, use the arrow, Tab, and Enter keys.

5. Tab to Network Devices and select Network settings from the list of devices.

The Network Settings window displays details of the current Data Protection Search host:

l Device name and type

l IP address

l Bus ID

6. Type Alt-I to edit the eth0 device.

7. If you are using a static IP, set the following network options, and then type Alt-N or tab to Next:

l IP address

l Subnet mask

l Hostname

8. Tab to Hostname/DNS and type Alt-S to set the following options:

l Hostname

l Domain name

l Name servers

l Domain search

9. Tab to Routing, and set the Default gateway.

Note

A default gateway is required. If one is not set, the deployment scripts display errors during firewall configuration.

10. Select OK, and exit the YaST2 tool.

If you change the hostname, a system restart is required for the change to take effect. Type yes to restart.

11. Before continuing with installation on the Data Protection Search node, switch to the DNS server to configure the host lookup:

l Add a record to resolve the hostname of the server to its IP address

l Ensure that nslookup returns the correct hostname/IP

l For NetWorker, the host IP address must resolve to the same hostname defined for the NIC

12. When the Data Protection Search node restarts, complete the following:

l Log in with username: dpsearch

l Password: dpsearch

l Log in as root, su

l Type the root password: linux

l Type the bash dp_install.sh command again to restart the installation process

Note

The default password is dpsearch, and linux for the root user. It is recommended that you change both passwords for security reasons.

13. Continue with the configuration process depending on the role you intend the Data Protection Search node to have:

l Single node (all in one)

n Follow the steps in Installing the Common Index Service (CIS).

n Follow the steps in Installing Worker components.

l Multi node

n Dedicated Index Master node (Do not configure more than one Index Master node). Follow the steps in Installing the Common Index Service (CIS).

n Dedicated Index Data node. Follow the steps in Installing Index Data components.

n Dedicated Worker node, perform either of the following:

Follow the steps in Installing Worker components (first Worker node).

Follow the steps in Installing additional Worker nodes (subsequent Worker nodes).

Note

The CIS node must be installed as the first Data Protection Search node before any other nodes are installed. There can only be one Index Master node with these components.

Installing the Common Index Service (CIS)

CIS must be installed on one node in the environment: either with Data Protection Search on an all-in-one node, or on a separate node in a multi-node environment. The steps in this section are required only for Data Protection Search Advanced configuration.

If the Advanced configuration option was selected from the dp_install.sh script, use the following procedure to continue with the CIS installation.

Note

The Basic configuration option provides default/recommended settings for the configuration options available in this section.

Procedure

1. Type 4 to launch Advanced configuration, then type 1 to configure CIS.

A submenu opens with the following options:

l Configure as Index Master Node (There can only be one CIS Master node)

l Configure as Index Data node (Many index data nodes can be added)

l Update LDAP Settings (used to update the LDAP query user password)

2. Respond to the following CIS installation prompts to configure the Elasticsearch nodes:

l Elasticsearch cluster name: Provide a unique cluster name

l Elasticsearch node name: Accept the default

l Elasticsearch node heap size: Usually, choose half the physical memory of the node (this value is usually already the default)

l Number of index replicas: Determines how many replicas are made of each index (the default of 1 is hard coded here). If you plan to install more than one Index Data node, set the number of replicas to at least 1 and no higher than (number of nodes - 1). Change the number of replicas from the default of 0 in the Options section of the Admin UI.

Note

If there are not enough nodes to create the requested number of replicas, the status of the cluster and individual indexes remain yellow in the Data Protection Search Admin UI Dashboard.

l Number of shards: Accept the default if you are not familiar with shard settings. The Elasticsearch website provides details on shards and recommended settings.

3. Respond to the prompts to configure the LDAP settings:

l Select AD or OpenLDAP

l LDAP hostname: Set to the name or IP of the OpenLDAP or Active Directory server (FQDN)

l LDAP Port: Port number (usually 389)

l LDAP Base DN: Base distinguished name of the domain or OU Data Protection Search uses when managing users and groups. For example:

n DC=domain, DC=com enables any users/groups in domain.com to be used

n OU=IT, DC=domain, DC=com enables any users/groups in the IT organizational unit of domain.com to be used

l LDAP Query Username: Account of a user that can query LDAP. Specify as username@domain.com (UPN for AD), or cn=username, dc=domain, dc=com (DN for OpenLDAP). This account is also the CIS Admin

l LDAP Login Password: Password for the account

The LDAP settings are validated. If the authentication fails, you are prompted to re-enter the information.

4. Press a key when prompted, Press Any key to continue....

l The Elasticsearch data storage location is set and the second disk drive for the virtual machine (/dev/sdb) is:

n Partitioned

n Formatted

n Mounted to /mnt/elasticsearch_data

l Elasticsearch settings (yml) are updated

l Elasticsearch services are restarted

5. Type 0 to return to the main menu when the new window is displayed.
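After the script finishes, you can confirm that the second disk was mounted as described in step 4. The following shell sketch is a suggested check, not part of the deployment script; the function name is hypothetical, and it relies on the standard mountpoint and df utilities.

```shell
#!/bin/sh
# check_es_data_mount [DIRECTORY]
# Hypothetical helper (not part of Data Protection Search).
# Succeeds and prints disk usage when DIRECTORY is a mount point.
# DIRECTORY defaults to the location named in step 4.
check_es_data_mount() {
    mount_dir=${1:-/mnt/elasticsearch_data}
    if mountpoint -q "$mount_dir" 2>/dev/null; then
        df -h "$mount_dir"
    else
        echo "$mount_dir is not a mount point" >&2
        return 1
    fi
}

# Example, run on the Index Master or Index Data node:
# check_es_data_mount
```

If the check fails, verify that the additional virtual disk was added to the virtual machine before the installation script was run.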

Installing Index Data components

You can configure Index Data components by using the DPSearch & CIS Deployment script.

Before you begin

The Index Master node must already have been installed and configured.

Note

The Index Master automatically includes the Index Data components. It is not necessary to perform these steps on the Index Master node.

From the DPSearch & CIS Deployment script, you can install additional index data nodes.

Procedure

1. Type 4 to launch Advanced configuration, and then type 1 to configure CIS.

2. Type 2 in the CIS Install menu:

l Existing DP machine name: Enter the name of the Index Master node (the first node installed)

l Elasticsearch Cluster name: Accept the default (unless changed for first node)

l Elasticsearch node name: Accept the default (name of this node)

l Elasticsearch node heap size: See the guidance for the initial CIS install

l Number of Index Replicas: Same as the Index Master node

l Number of Index Shards: Same as the Index Master node

3. Repeat these steps to install additional index data nodes.

4. To verify that the index data nodes were added successfully, open CIS nodes in the System section of the Admin UI.

The nodes are listed.
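The heap size guidance for Elasticsearch nodes (usually half of the physical memory of the node) can be computed as shown in the following shell sketch. The function name is hypothetical, and the input is the memory total in kilobytes, as reported in /proc/meminfo.

```shell
#!/bin/sh
# recommended_heap_mb TOTAL_KB
# Hypothetical helper (not part of Data Protection Search).
# Prints half of TOTAL_KB, converted from kilobytes to megabytes.
recommended_heap_mb() {
    echo $(( $1 / 2 / 1024 ))
}

recommended_heap_mb 33554432   # a node with 32 GB of RAM: prints 16384
```

On a running node, the input can be read with awk '/^MemTotal:/ { print $2 }' /proc/meminfo. The installer usually offers this value as the default, so the calculation is mainly useful for checking a customized setting.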

Installing Worker components

Before you begin

The Index Master components must already have been installed and configured on either this node, or on another node.

Procedure

1. Type 4 to begin Advanced configuration, then type 2 to configure Data Protection Search.

2. Configure the initial Data Protection Search node:

l Install directory: Accept the default although it can be modified.

l CIS URL: If CIS was installed on this node (Index Master), accept the default. If CIS was installed on another node, type the URL of that node. Use the same format as the default, but change the node name.

l CIS Admin: Type the username and password for the CIS Admin configured in Installing the Common Index Service (CIS)

l Data Protection Search Application name: Accept the default

l Worker node: Accept the default (yes), unless you are creating customized Worker nodes

l Web Services node: Leave as default (yes), unless you are creating customized Workers

l Data Protection Search Admin Group name: The name of a security group in LDAP

While the installation is validated, the following occurs:

l The CIS Admin is logged in to CIS

l The DPSearch Admin group is validated in LDAP

l You are prompted to Press any key to Continue...

3. Follow the prompt to Press any key to Continue.... The script will:

l Log in to CIS

l Create the DPSearch application

l Add the current node

l Set up system configuration

l Create the system index

l Initialize the system index mappings

l Configure the system index

l Create system activities

l Configure the worker

l Configure LDAP

The Data Protection Search & CIS Deployment script opens.

4. Type 0 to return to the main menu.

5. Type 6 to change the date & time.

Ensure that the date and time zone are the same as the backup servers.

6. Exit the installation script.

Results

You can now log in to the DPSearch Admin UI.

Installing additional Worker nodes

Use the following procedure to install additional Worker nodes.

Before you begin

The initial Worker node and the Index Master node must be installed and configured.

Procedure

1. Type 4 to start Advanced Configuration.

2. Type 2 to Configure Data Protection Search.

3. Type 2 to Configure an additional Data Protection Search Node.

Consider the following:

Table 6 Prompt descriptions

Prompt                      Description
Install directory           Can be modified. Modifying the install directory path is not recommended.
CIS URL                     Use the CIS URL for the Index Master node. Use the full URL as in the default.
CIS Admin                   Use the same username and password that is used for the Index Master node.
DPSearch Application name   Use the default DPSearch Application name unless the name was changed on the initial node.
Worker node                 Use the default option unless you are creating customized Worker nodes.
Web Services node           Use the default unless you are creating customized Worker nodes.

4. To install additional Worker nodes, repeat these steps.

Installing a self-signed or trusted certificate

The NGINX web server that is provided with Data Protection Search is installed with a self-signed certificate, not a trusted public key certificate. The certificate is used for secure HTTP (HTTPS) access to the web user interfaces, the Admin and Search REST APIs, and the Common Indexing System (CIS) REST API. The certificate also secures communications between the components.

When a self-signed certificate is active, users who connect to the web-based Admin and Search interfaces are warned that the connection is untrusted. For most web browsers, this warning can be suppressed after it is initially displayed.

To install either a self-signed or a trusted certificate for the Data Protection Search NGINX web server, perform the following steps:

Procedure

1. Connect to the Data Protection Search node as root, and use the default password linux.

2. Copy the existing certificate and private key files to a backup location:

cp /etc/nginx/dpsearch.cert /BACKUP_LOCATION

cp /etc/nginx/dpsearch.key /BACKUP_LOCATION

3. (Optional) Generate a new private key:

openssl genrsa -out dpsearch.key 2048

4. Complete either of the following:

l To create a self-signed certificate by using the existing or newly generated private key file, type the following command and respond to the prompts:

openssl req -new -x509 -key dpsearch.key -out dpsearch.cert -days 1095

l To generate a certificate signing request (CSR) file by using either the existing or newly generated private key file, type the following command:

openssl req -new -key dpsearch.key -out dpsearch.csr

a. Respond to the prompts.

b. Send the dpsearch.csr file to the certificate authority.

c. Rename the returned certificate file to dpsearch.cert.

5. Stop the NGINX service:

service nginx stop

6. Copy the new certificate to the /etc/nginx directory. Optionally, copy the new private key to the /etc/nginx directory:

cp dpsearch.cert /etc/nginx/

cp dpsearch.key /etc/nginx/

7. Verify that the files have the correct permissions:

chmod 644 /etc/nginx/dpsearch.cert

chmod 644 /etc/nginx/dpsearch.key

8. Start the NGINX service:

service nginx start
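Before NGINX is started with a new certificate, it can be useful to confirm that the certificate and the private key belong together, because a mismatched pair typically prevents the web server from loading the certificate. The following is a minimal sketch and not part of the product: the function name cert_matches_key is hypothetical, and it relies only on the standard openssl x509 and openssl pkey subcommands to compare the public key held in each file.

```shell
# Hypothetical helper (not shipped with Data Protection Search).
# Returns 0 when the certificate and the private key contain the same public key,
# 1 when they differ, and 2 when either file cannot be read by openssl.
cert_matches_key() {
  cert_file="$1"
  key_file="$2"
  # Extract the public key from the certificate and from the private key.
  cert_pub=$(openssl x509 -in "$cert_file" -noout -pubkey 2>/dev/null) || return 2
  key_pub=$(openssl pkey -in "$key_file" -pubout 2>/dev/null) || return 2
  # The pair matches only if both public keys are present and identical.
  [ -n "$cert_pub" ] && [ "$cert_pub" = "$key_pub" ]
}

# Example, using the paths from the procedure above:
#   cert_matches_key /etc/nginx/dpsearch.cert /etc/nginx/dpsearch.key && echo "pair matches"
```

Comparing public keys rather than RSA moduli keeps the check independent of the key type, which matters if the certificate authority returns a certificate for a non-RSA key.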

Update LDAP configuration

During deployment, the configuration of an LDAP server must be specified. At a later date, some of the specified settings might need to be updated, for example, the password of the LDAP query account or the account name itself.

There are two ways to update LDAP configuration as described in the following tasks.

Updating LDAP configuration in the Data Protection Search Admin UI

Procedure

1. Log in to the Data Protection Search Administration UI by selecting the following:

Note

You must log in to the Data Protection Search Administration UI each time it is opened, or after an inactivity timeout (1 hour by default).

2. Select Administration > Options > LDAP Options.

3. In the Host field, type the host name of the LDAP server.

4. In the Port field, type the port number that is used by the external authentication authority:

l For LDAP, the default port number is 389.

l For LDAPS (LDAP over SSL), the default port number is 636.

5. In the Base DN field, type the scope of the users and groups that are considered within the LDAP server.

For example:

DC=example, DC=com

The Base DN determines the location in the LDAP directory structure where the search filter is applied. This is usually similar to the domain name over which the LDAP server has authority.

6. In the Admin User field:

a. Type a user account that has full read access to the LDAP directory, in the following format:

user@domain. For example, administrator@ldap.example.com

b. Ensure that the username is one of the following:

l Common name

l Email address

l Entry distinguished name

c. Ensure that the user has read access to the directory.

d. To include email notification, define the email address for the account.

Note

Only admin accounts with defined email addresses can receive email notifications. The default admin user is not configured with an email address and cannot receive email notifications.

7. In the Password field, type the password of the user account that you specified in the Admin User field.

8. In the SSL field, select either of the following options:

l To not apply secure connection settings, select Off. This option is the default setting.

l To connect to an external authentication server using LDAPS, select On.

The Verify Certificates field appears.

9. To verify certificates:

a. In the Verify Certificates field, select On.

b. Copy the PEM files to the appropriate directory:

l For Data Protection Search 1.1 SP3:

/etc/pki/trust/anchors/

l For Data Protection Search 1.1, 1.1 SP1, and 1.1 SP2:

/usr/share/ca-certificates/

c. At the command prompt, type the following command:

update-ca-certificates

10. Click Validate.
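Step 5 notes that the Base DN is usually similar to the domain name over which the LDAP server has authority. As an illustration of that relationship only, the following sketch converts a DNS-style domain name into DC components. The function name domain_to_base_dn is hypothetical, and the result is a starting point: the Base DN actually used by a given directory can differ and should be confirmed with the directory administrator.

```shell
# Hypothetical helper: convert a DNS domain name into a DC-style Base DN.
# The function body runs in a subshell, so the IFS and globbing changes
# do not leak into the calling shell.
domain_to_base_dn() (
  set -f        # disable globbing while the domain is split
  IFS='.'       # split the domain on dots
  dn=""
  for label in $1; do
    # Append ",DC=<label>", omitting the comma before the first component.
    dn="${dn:+$dn,}DC=$label"
  done
  printf '%s\n' "$dn"
)

# Example:
#   domain_to_base_dn ldap.example.com
```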

Updating LDAP configuration in the Data Protection Search Admin installation script

If the LDAP query user password has been reset, it may not be possible to log in to the Data Protection Search Admin UI, because the LDAP query user is required to authenticate the login user against the directory service. If required, modify the LDAP configuration from a terminal session on the Data Protection Search Index Master virtual machine by using an SSH tool such as PuTTY.

Procedure

1. Log in as root (the default password is linux).

2. Change to the install directory.

Note

For versions earlier than Data Protection Search 1.1 SP3, enter cd download to change to the directory that contains the install scripts.

3. Run the dp_install.sh script:

bash dp_install.sh

4. Select option 4, Advanced Configuration.

5. Select option 1, Configure CIS.

6. Select option 3, Update LDAP settings.

7. To update the settings, follow the prompts.

Updating the Data Protection Search Admin Group

If the LDAP domain changes, it might be necessary to change the Data Protection Search Admin Group.

Procedure

1. Open a terminal session for any Data Protection Search Worker node virtual machine by using an SSH tool such as PuTTY. Log in as root with the default password linux.

2. Change to the /bin subdirectory of the installation directory. The default is /usr/local/dpsearch/bin.

3. In a text editor, open the config.json file, and change the Data Protection Search Admin Group in the admin_cn field of ldap_settings.

4. Save the file, and exit the editor.

5. Run the following command:

ruby config_system.rb -o set_group -f config.json

Upgrading to the current release of Data Protection Search

An upgrade .zip package (dpsearch-upgrade-build_number.zip) is available for download from the same location as the binaries.

Procedure

1. Download the .zip file and unzip it by using an appropriate program.

There are two directories:

l Puppet

Note

Puppet is an open source configuration management tool that provides the ability to deploy the update files from a single node to all nodes in a multi-node Data Protection Search cluster environment. Puppet must be installed on each node in the cluster only when upgrading from Data Protection Search 1.0.

l Upgrade

2. Using a secure FTP client such as WinSCP:

l Copy both the /upgrade and /puppet directories to the Data Protection Search Index Master Node.

l Copy the puppet directory to each node in the cluster.

3. In a Console window (SSH), browse to the /puppet subdirectory of the directory to which you copied the files, and log in as root user:

su

Password

4. Run the puppet install script:

bash config_puppet.sh

5. In the Data Protection Search Index Master Node, type y to confirm that it is the puppet master:

would you like to configure this node to work as puppet master? (y)es or (n)o:

The installation completes.

Note

If the deployment is an all-in-one environment, configure it as puppet master as described in step 5, and skip to step 7 directly to upgrade this all-in-one node.

6. For the remaining nodes in the cluster:

a. Browse to the /puppet subdirectory of the directory to which you copied the files, and log in as root user:

su

Password

b. Run the bash config_puppet.sh script.

c. Type n for the following prompt, to specify that it is a puppet agent:

would you like to configure this node to work as puppet master? (y)es or (n)o:

d. When prompted, provide the location for the puppet master on the Data Protection Search Index Master Node.

Note

Steps 4-6 are only used to configure the Puppet environment. It is necessary to complete steps 4-6 when upgrading from Data Protection Search 1.0 to Data Protection Search 1.1. Steps 4-6 can be skipped for all subsequent upgrades. Also, to change the puppet role of a single node, manually delete the /etc/puppet/puppet.conf file and re-run the bash config_puppet.sh script.

For a new Data Protection Search 1.1 installation, it is not required to configure the puppet master, and steps 4-6 are unnecessary. The Data Protection Search Admin can specify the puppet master and puppet agent roles from the installation menu.

Figure 6 Configuration script options

7. From the Data Protection Search Index Master Node, browse to the directory to which you copied the files (the parent folder of the /puppet subdirectory).

To provide access permissions for the upgrade and puppet files, run the following commands:

chmod 777 -R upgrade/

chmod 777 -R puppet/

8. On the Data Protection Search Index Master Node, browse to the /upgrade subdirectory of the directory to which you copied the files, and run the upgrade script:

bash update.sh -o deploy

The upgrade is applied to all nodes in the cluster. There are many files so the upgrade can take some time to complete.

9. (Optional) Run a report to view the status of the upgrade, and to verify that the upgrade completed successfully for all nodes in the cluster:

bash update.sh -o report

The report (report_date_unique identifier.csv) is available in the /download/upgrade/report directory.

10. Use a secure FTP client such as WinSCP or PuTTY to copy the report and view it in an application such as Microsoft Excel.

11. View the update logs from the /root/.dpsearch_update/log/ directory:

ls /root/.dpsearch_update/log -l

Note

The log files are useful if the upgrade fails for any nodes in the cluster.

Results

When you log in to Data Protection Search from the web browser, you see the login dialog box, and the dashboard for the current release of the product. In the Workers section of the Admin dashboard, the Workers are listed and the new build number is displayed in the respective details.

CHAPTER 3

Administration

This section includes the following topics:

l Data Protection Search Administration Web Application

l Logging in to the Data Protection Search Admin user interface

l Data Protection Search Admin UI home

l Data Protection Search dashboard

Data Protection Search Administration Web Application

The Data Protection Search Admin user interface provides the ability to administer, configure, and customize Data Protection Search.

When the Data Protection Search virtual appliance is deployed and the web server is configured and running, you can access the Data Protection Search Admin web app hosted by any Worker node.

Logging in to the Data Protection Search Admin user interface

Before you begin

To log in to the Admin UI initially, you must be a member of the Data Protection Search Admin group (LDAP user) configured during installation. Later, additional Index Admins can be created to log in and access the Admin UI. Cookies must be enabled in the browser.

After completing the Data Protection Search virtual appliance deployment, log in to the Admin UI.

Figure 7 Admin UI login

Active Directory and OpenLDAP are both supported and, by default, are configured in /usr/local/dpsearch/etc/ldap.conf:

l The following are examples of supported Active Directory username formats:

n SamAccountName (administrator)

n User Principal Name (administrator@domain.com)

n DistinguishedName (cn=administrator, cn=users, dc=domain, dc=com)

n Windows NT account (domain\administrator)

n Mail (administrator@domain.com)

n cn (administrator)

l The following are examples of supported OpenLDAP username formats:

n cn (administrator)

n Mail (administrator@domain.com)

n entrydn (cn=administrator, cn=users, dc=domain, dc=com)
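The username formats listed above differ in a few visible characters, which makes them easy to tell apart when troubleshooting a failed login. The following sketch is illustrative only and does not reflect how Data Protection Search itself parses usernames. The function name username_format and the category labels are hypothetical.

```shell
# Hypothetical helper: classify a login name by its visible shape.
# This mirrors the formats listed above. It is not the product's own logic.
username_format() {
  case "$1" in
    *=*)  echo "distinguished-name" ;;   # cn=administrator, cn=users, dc=domain, dc=com
    *\\*) echo "nt-account" ;;           # domain\administrator
    *@*)  echo "upn-or-mail" ;;          # administrator@domain.com
    *)    echo "plain-name" ;;           # SamAccountName or cn
  esac
}

# Example:
#   username_format 'administrator@domain.com'
```

A User Principal Name and a Mail value have the same shape, so the sketch reports them as one category. Telling them apart requires a directory lookup.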

Procedure

1. Type the username in the User name field.

2. Type the password in the Password field.

3. Click Login.

At initial login, all options of the Data Protection Search Dashboard are available, because the DPSearch Admin user is also an Index Admin. You must log in to the Data Protection Search UI each time it is opened, or after an inactivity timeout (1 hour by default). The default inactivity timeout can be modified in the Options section of the Admin UI.

Data Protection Search Admin UI home

The Data Protection Search Admin UI is customized based on role.

Figure 8 Admin UI dashboard

The following table lists and describes the sections of the Data Protection Search Admin UI.

Table 7 Data Protection Search Admin UI

Admin web UI tab | Description | Visible to DPSearch Admin | Visible to Index Admin
Dashboard | View a summary of the health/status for the various Data Protection Search components and drill down for more information. | Yes, with the exception of Indexes and Scheduled Collections. | Yes, can view Indexes, Scheduled Collections, and Notifications.
Sources | Add, update, or remove Avamar and NetWorker servers. | Yes | No
Roles | Add, update, or remove Index Admins. DP Search Admins are listed, but cannot be modified. Instead, they are managed with the LDAP solution. | Yes | No
Indexes | Add, update, or remove indexes. Manage permissions for each Search Admin. | No | Yes
Collections | Schedule collections for Avamar and NetWorker backup servers. | No | Yes
Systems | Monitor Data Protection Search Worker and Index nodes. | Yes | No
Jobs | Lists running and completed activities and jobs. Details include type, status, duration, and more. | Yes | Yes
Options | Modify or enable the following: Search Options (the number of Search hits to display); Session Options (timeout); LDAP Options; Email notification Options (On or Off); Index Options (including file types to exclude from Full Content Index, Number of replicas, and Apply replica settings to existing indexes). | Yes | No
Help | Access the Data Protection Search online help. | Yes | Yes

Data Protection Search dashboard

After you log in to the Data Protection Search Admin UI, the full dashboard opens and displays a summary of the health/status of the components that make up the Data Protection Search environment. Each section is color coded and shows the number of healthy (green), warning (yellow), and error (red) items for each component.

To view a detailed list and additional information, click the expand/collapse arrow to the right of each dashboard component name. For Scheduled Collections and Notifications, click More... to view a detailed list and additional information. To force a data refresh, click the refresh icon for the component.

For Servers, a Last Updated Time field provides a timestamp for unresponsive servers. If the Last Updated Time value remains the same after approximately 1 hour, the server is listed in a red status bar, and a message similar to the following is displayed:

One or more Servers are unresponsive

For Workers, a Last Heartbeat Time field provides a timestamp for unresponsive workers. If the Last Heartbeat Time value remains the same after approximately 10 minutes, the Worker is listed in a red status bar, and a message similar to the following is displayed:

One or more Worker services are unresponsive
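The two thresholds above (approximately 1 hour for servers and approximately 10 minutes for Workers) amount to a simple staleness test on a timestamp. The following sketch shows that test under stated assumptions: the function name is_unresponsive is hypothetical, timestamps are given as seconds since the epoch, and the thresholds are treated as exact values, whereas the product describes them only as approximate.

```shell
# Hypothetical helper: report whether a component looks unresponsive.
#   $1  last update or heartbeat time (seconds since the epoch)
#   $2  current time (seconds since the epoch)
#   $3  threshold in seconds (3600 for servers, 600 for Workers in this sketch)
# Returns 0 when the timestamp has not changed for at least the threshold.
is_unresponsive() {
  [ $(( $2 - $1 )) -ge "$3" ]
}

# Example: a Worker whose last heartbeat was 15 minutes ago
#   now=$(date +%s)
#   is_unresponsive $(( now - 900 )) "$now" 600 && echo "Worker looks unresponsive"
```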

Figure 9 Data Protection Search Admin dashboard

The following table lists the dashboard components and visibility based on Admin permissions.

Table 8 DP Search dashboard

Health/Status for component | Description | DP Search Admin | Index Admin | Both DP Search and Index Admin
Servers | Lists up to 5 configured backup servers with information on platform, version, and status. From the bottom of the expanded list of servers, click More... to open the Sources section of the Admin UI to view additional information. The last time the status of each backup server was updated is displayed. To refresh the server status, click the refresh icon. | Yes | No | Yes
Workers | Lists up to 5 configured Data Protection Search Worker nodes and their status. To see a last updated time, and view unresponsive worker services, click More... at the bottom of the list of expanded Worker nodes. The System section of the UI opens. | Yes | No | Yes
Clusters | Lists the status of up to 5 configured Elasticsearch clusters. To open the System section of the DPSearch UI, click More... at the bottom of the list. | Yes | No | Yes
Indexes | Lists up to 5 configured indexes and their status. At initial login, no indexes have been created so the list is empty. To open the Indexes section of the DPSearch UI, click More... at the bottom of the list. | No | Yes | Yes
Scheduled Collections | Lists up to 5 upcoming scheduled collections. To view the full list of configured collections, click More... at the bottom of the list. | No | Yes | Yes
Notification | Displays up to 5 recent notifications. To view and manage notifications, click More.... Configure email notifications in Email Notification Options in the Options section of the Admin UI. You can open System Notifications, mark them as read or unread, or view all notifications. | Yes | Yes | Yes

CHAPTER 4

Sources

This section includes the following topics:

l Sources

l Add an Avamar server to Data Protection Search

l Default Avamar server limit

l Adding a NetWorker source server to Data Protection Search

l Connection Limitations considerations

l Updating an Avamar or NetWorker server

l Removing a server from Data Protection Search

l Registering agents manually

Sources

Add, update, and remove Avamar and NetWorker servers in the Sources section of the Data Protection Search Admin UI. At initial login, the Data Protection Search Admin is prompted to add sources.

Note

A check mark in the server product icon (Avamar or NetWorker) confirms that there is a connection to the respective server. If Data Protection Search is unable to connect to the server, a warning icon is displayed.

You can also use Find by name... to find a server from the list.

Figure 10 Sources

Add an Avamar server to Data Protection Search

You can add Avamar servers to Data Protection Search.

Procedure

1. In the Sources section of the UI, click the add icon.

The Add Source window opens.

2. In the Add Source window, complete the required fields as listed in the following table.

Table 9 Required fields

Required field | Description
Name | Type the source hostname or IP of the Avamar server.
Platform | Select Avamar from the drop-down list.
Port | Default is port 9443.
User ID | Can be an existing Administrator user, an Avamar Management Console user, or a Data Protection Search user. Note: You can choose an existing Avamar Admin user for that server or create your own in the Avamar Administrator.
Password | Provide the password for the user specified in User ID.
Confirm Password | Re-enter the password.
Timezone | Choose a time zone for the current Avamar server from the drop-down list.
Connection Limitation | Click the check box to enable a Connection Limitation. Note: Set a limitation, or leave the fields (Indexing, Restore, and Download) blank for unlimited connections. The Connection Limitation is disabled by default.
Enable blackout window | Click the check box to enable the blackout window, and select the time range each day during which no collection jobs can run for that backup server.
Enable only for full content indexing | Click the check box to restrict only full-content collection activities during the blackout window. If checked, metadata-only collections can run at any time.

3. Click Add.

The new server is now available in the list of servers in Sources. When a server is added, a system job is added to register the Avamar client on each Data Protection Search Worker node to the new server. This can be monitored in Jobs by clicking the System Jobs checkbox.

Note

When the Avamar domain /clients is missing, the Avamar server cannot be added as a source in Data Protection Search.
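The blackout window settings in the Add Source window combine a daily time range with an option that limits the restriction to full content indexing. The following sketch shows one way to reason about that combination. It is an illustration under assumptions, not the product's scheduler: the function names are hypothetical, times are expressed as minutes since midnight, the end of the window is treated as exclusive, and windows that cross midnight are assumed to be allowed.

```shell
# Hypothetical helper: is the current time inside the daily blackout window?
#   $1 now, $2 window start, $3 window end (all in minutes since midnight)
in_blackout() {
  if [ "$2" -le "$3" ]; then
    # Window within a single day, for example 01:00 to 05:00.
    [ "$1" -ge "$2" ] && [ "$1" -lt "$3" ]
  else
    # Window that crosses midnight, for example 22:00 to 06:00.
    [ "$1" -ge "$2" ] || [ "$1" -lt "$3" ]
  fi
}

# Hypothetical helper: would a collection be held back right now?
#   $4 "yes" if "Enable only for full content indexing" is checked
#   $5 collection type: "metadata" or "full"
collection_blocked() {
  in_blackout "$1" "$2" "$3" || return 1
  if [ "$4" = "yes" ] && [ "$5" = "metadata" ]; then
    return 1   # metadata-only collections can run at any time
  fi
  return 0
}

# Example: 00:30, window 22:00 to 06:00, restriction limited to full content
#   collection_blocked 30 1320 360 yes full && echo "held until the window ends"
```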

Default Avamar server limit

Data Protection Search can index multiple Avamar servers simultaneously. Data Protection Search registers multiple Avamar File System clients on each Data Protection Search node. Each client is configured to use a different Transmission Control Protocol (TCP) port to communicate with the server, beginning at 28002.

By default, for Avamar 7.1 and later, the Avamar server only exposes ports through the firewall ranging from 28002 to 28011. Once a port above 28011 is used, the server is unable to connect to the client.

Note

If more than 10 Avamar servers are configured in Data Protection Search, the server is unable to connect to the client. Removing and re-adding a server does not work.

In this situation, adding an Avamar server to Data Protection Search succeeds. However, full content indexing or download operations do not complete. Browsing Data Protection Search nodes in the Avamar Administration GUI results in a browse timeout. Client details display a page port of 28012 or higher.
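The limit of 10 servers follows from the port numbers: clients start at port 28002, and the default firewall range ends at 28011. The following sketch works through that arithmetic. It assumes that ports are assigned consecutively, one per configured Avamar server, which is consistent with the description above but is not confirmed as the exact allocation scheme. The function and variable names are hypothetical.

```shell
# Assumption: clients are assigned consecutive ports, one per Avamar server,
# starting at 28002 (the first port named in the text).
FIRST_CLIENT_PORT=28002
DEFAULT_LAST_OPEN_PORT=28011

# Print the port that the Nth configured Avamar server (1-based) would use.
avamar_client_port() {
  echo $(( FIRST_CLIENT_PORT + $1 - 1 ))
}

# Return 0 if the port falls inside the range that is open by default.
port_is_open_by_default() {
  [ "$1" -ge "$FIRST_CLIENT_PORT" ] && [ "$1" -le "$DEFAULT_LAST_OPEN_PORT" ]
}

# Example: the 11th server would need a port outside the default range
#   port_is_open_by_default "$(avamar_client_port 11)" || echo "port range must be widened"
```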

To work around this issue, increase the port range to accommodate the additional servers by editing the /etc/firewall.default file on the Avamar server. Modify the following line to include a wider range:

exec_rule -A OUTPUT -p tcp --dport 28001:28011 -j ACCEPT

For example:

exec_rule -A OUTPUT -p tcp --dport 28001:28200 -j ACCEPT

Note

Once a port range is expanded on the Avamar server, existing Data Protection Search operations may not complete. If the operation does not complete, cancel and then re-run the operation.
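The firewall rule change described above can be prepared on a copy of the file and reviewed before anything on the Avamar server is replaced. The following is a minimal sketch under assumptions: the function name widen_port_range is hypothetical, the rule is assumed to contain the text dport 28001: followed by the current upper bound, and the function writes the modified content to standard output so the input file is left unchanged.

```shell
# Hypothetical helper: print the firewall file with the upper bound of the
# 28001:<upper> port range replaced. The input file itself is not modified.
#   $1 path to a copy of the firewall file
#   $2 new upper bound for the port range
widen_port_range() {
  file="$1"
  new_upper="$2"
  # Keep everything up to "28001:" and replace only the digits that follow.
  sed "s/\(dport 28001:\)[0-9][0-9]*/\1${new_upper}/" "$file"
}

# Example: work on copies and compare before replacing the original
#   cp /etc/firewall.default /tmp/firewall.default.orig
#   widen_port_range /tmp/firewall.default.orig 28200 > /tmp/firewall.default.new
#   diff /tmp/firewall.default.orig /tmp/firewall.default.new
```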

Adding a NetWorker source server to Data Protection Search

Procedure

1. In the Sources section of the UI, click Add.

The Add Source window opens.

2. In the Add Source window, complete the required fields as listed in the following table.

Table 10 Required fields

Required field | Description
Name | Type the source hostname or IP of the NetWorker server.
Platform | Select NetWorker from the drop-down list.
Time zone | Choose a time zone for the current NetWorker server from the drop-down list.
Connection Limitation | To enable a Connection Limitation, click the checkbox. Note: Set a limitation, or leave the fields (Indexing, Restore, and Download) blank for unlimited connections. The Connection Limitation is disabled by default.
Enable blackout window | To enable the blackout window and select the time range each day during which collection jobs do not run, click the checkbox.
Enable only for full content indexing | To restrict only full content collection activities during the blackout window, click the checkbox. If restricted, metadata-only collections still run.

Note

Configure all Data Protection Search nodes as a client of the NetWorker server before adding it.

3. Click Add.

The server is now listed in Sources. When a server is added, a system job is added for each Data Protection Search Worker node, and the NetWorker registration process completes. The registration can be monitored in Jobs by clicking the System Jobs checkbox.

Note

Data Protection Search does not recognize a change in retention that is made on the backup server for 30 days.

Connection Limitations considerations

Connection Limitations provide the ability to limit the number of concurrent connections that Data Protection Search can open to the backup server. Limiting the number of concurrent connections leaves server resources for necessary operations such as scheduled backups.

Set limitations in the following three fields:

l Indexing

l Restore

l Download

Leaving Connection Limitations fields blank permits unlimited connections between Data Protection Search and the backup server.

Indexing

Indexing means connections that are made to index the metadata or full content of files as part of a collection activity. If you have multiple Worker nodes, one collection activity can create many connections. Set a connection value to limit the number of connections that the Worker nodes open.

Restore

Restore means restore operations that are manually triggered from the Search application by Search Admins.

Download

Download means download operations that are manually triggered from the Search application. Each download includes only one file, so limiting downloads may be a lower priority than limiting indexing, which can generate hundreds of requests.

Impact of Connection Limitations

l Limiting indexing connections results in slower Data Protection Search indexing.

l Limiting restore or download connections results in longer response times for restore or download requests for Search Admins.
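The rule that a blank field means unlimited connections can be expressed as a small admission check. The following sketch is illustrative only: the function name connection_allowed is hypothetical, and it models a single limit field (Indexing, Restore, or Download) rather than the way Data Protection Search enforces limits internally.

```shell
# Hypothetical helper: may one more connection of a given type be opened?
#   $1 number of connections of that type currently open
#   $2 configured limit for that type; an empty value means unlimited
# Returns 0 when another connection is allowed.
connection_allowed() {
  [ -z "$2" ] || [ "$1" -lt "$2" ]
}

# Example: 4 indexing connections open, limit of 4, so the next request waits
#   connection_allowed 4 4 || echo "indexing connection limit reached"
```

With a limit in place, requests beyond the limit have to wait, which is the source of the slower indexing and longer response times listed above.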

Updating an Avamar or NetWorker server

You can update the configuration of an Avamar or NetWorker server that was previously added to Data Protection Search.

Click the Avamar or NetWorker server, and click the edit icon. In the Update window, perform one of the following:

l For an Avamar server, complete the following fields:

n Name

n Port

n User ID

n Password

n Time zone

n Connection Limitation

n Blackout window (the update takes effect at the next runtime)

Note

You can choose an existing Avamar Admin user for the server, or create a user in the Avamar Administrator.

l For a NetWorker server, the following fields can be updated:

n Time zone

n Connection Limitation

n Blackout window

Removing a server from Data Protection Search

Use the following procedure to remove Avamar and NetWorker servers from Data Protection Search.

When a backup server is removed, any item that was indexed from that server remains in the index until the backups expire. However, it is not possible to download or restore these items, even if the backup server is re-added, because the re-added server has a different internal identifier. It is recommended that you remove the associated indexes for deleted servers.

Procedure

1. Select the server from the list and click the Delete server icon.

2. A message similar to the following is displayed:

Selected item(s) will be permanently deleted. Please type DELETE to confirm.

Note

If there are outstanding collection activities that are defined, a server cannot be removed. Delete the collections first. Also, a source cannot be removed while registering jobs are running on it, for example, when the server status is initializing, or after registering jobs are created manually.

3. Type DELETE, and click Confirm.

Results

The server is no longer listed.

Registering agents manually

If a source fails to initialize, or the message Some agents failed to register is displayed, some of the agents did not register correctly and might not work. Registration failures are typically caused by incomplete configuration, network connectivity problems, firewall issues, and so on. If required, you can manually trigger registering jobs for a server to correct the problem.

Procedure

1. Select the server from the list, and click the Register icon.

A message similar to the following is displayed: Successfully created jobs to register backup server agents. The status of the jobs can be monitored on the Jobs page.

2. Click OK to close the message.

Note

If registering jobs are running on the server, the Register icon is disabled.

3. Monitor the registering job status in Jobs > View system jobs, or refresh the list of sources in the Sources window.

CHAPTER 5

Roles

This section includes the following topics:

l About roles

l Managing roles

About roles

A role defines the privileges and permissions for users to perform a group of tasks.

When you configure the DPSearch virtual appliance, there are already predefined users from OpenLDAP.

Index Admin role

Users with the Index Admin role have permissions to perform the following tasks:

l Create metadata only indexing collection activities (default)

l Create metadata only and full content indexing collection activities (must specifically enable full content indexing capability)

l Create and maintain indexes

l Monitor index jobs

l Receive notifications related to index jobs

Data Protection Search Admin role

If required, you can create multiple indexes in Data Protection Search and specify the users and groups that have access to those indexes. These users and groups are referred to as Search Admins. When a Search Admin logs in to the Search UI, they can search only those indexes to which they have access.

The following table lists the DPSearch Admin roles.

Table 11 Admin roles

Search Admin role | Description
Index Admin - All access | No restrictions are applied.
Index Admin - Read only | Cannot view inline or full content preview for search hits, download files locally, or restore files to an alternate location.

Note

The Data Protection Search Admin Group is the default Index Admin. Members of the Data Protection Search Admin Group are listed and cannot be edited directly. DPAdmin users are added and modified in any LDAP-based directory service, such as Active Directory.

DPSearch Admin role

Users with the DPSearch Admin role can perform the following tasks:

l Manage backup servers

l Manage roles

l Monitor system

l Monitor system jobs

l Set options

l Receive notifications

Managing roles

You can add and edit user roles and assign access privileges to administrators.

Add Index Admins

Procedure

1. In the Indexes > Index Admin window, click the add icon.

2. In the Select User window:

a. Type a username or a substring.

b. To display the list of matching Active Directory or OpenLDAP users, click Find.

c. Restrict the search to Users only, Groups only, or both (the default).

Note

If you click Find before specifying a substring, the entire directory is returned, which can be slow. For example, to search for all users that contain Admin, type Admin, and click Find.

3. Select the user or group to add and enable Metadata Index only (default) or Metadata and Full content Index.

Note

Full content index searches can take longer than Metadata index searches and put a substantial strain on the backup server and backup storage performance.
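The substring search in the Select User window can be pictured as an LDAP substring filter. The following is a minimal sketch under that assumption, not the product's implementation: the attribute name cn and the object classes person and group are hypothetical choices, and the function names are illustrative only. It also shows why an empty substring matches the entire directory.

```python
# Hypothetical sketch: build an LDAP search filter for a user/group lookup.
# The attribute (cn) and object classes are assumptions, not product values.

# Characters that must be escaped inside an LDAP filter value (RFC 4515).
_ESCAPES = {"\\": r"\5c", "*": r"\2a", "(": r"\28", ")": r"\29", "\x00": r"\00"}


def escape_filter_value(value: str) -> str:
    """Escape characters that are special in LDAP search filters."""
    return "".join(_ESCAPES.get(ch, ch) for ch in value)


def build_filter(substring: str, scope: str = "both") -> str:
    """Return a filter for users only, groups only, or both (the default)."""
    classes = {
        "users": "(objectClass=person)",
        "groups": "(objectClass=group)",
        "both": "(|(objectClass=person)(objectClass=group))",
    }
    if scope not in classes:
        raise ValueError(f"unknown scope: {scope}")
    if substring:
        match = f"(cn=*{escape_filter_value(substring)}*)"
    else:
        # No substring: a presence filter matches every entry, which is slow.
        match = "(cn=*)"
    return f"(&{classes[scope]}{match})"


print(build_filter("Admin", "users"))
```

Typing a substring before clicking Find corresponds to the narrower filter, which is why it returns results faster than an unrestricted lookup.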

Remove an Index Admin

Use the following procedure to remove Index Admins from Data Protection Search.

Before you begin

You must have full Data Protection Search Admin privileges to remove Index Admin users or groups from Data Protection Search.

Procedure

1. In the Indexes > Index Admin window, click the Delete icon.

2. To remove users or groups, select the checkboxes, and click Remove.

Results

The users or groups are no longer listed.

CHAPTER 6

Indexes

This section includes the following topics:

l Indexes view

l Add an index

l Edit an index

l Remove an index

Indexes view

Indexes hold the indexed metadata and/or content extracted from backup files. Data Protection Search provides the ability to create multiple indexes, and to specify the particular users and/or groups able to access those indexes. These users/groups are referred to as Search Admins. When a Search Admin logs in to the Search UI, they can search only those indexes to which they have access.

All configured indexes are listed in the Indexes section of the UI. For each index, the following information is displayed:

l The name and description of the index

l The size and number of items in the index

l Information that is provided by Elasticsearch:

n The number of items in the index is not exact, since additional records for backups/save sets are also stored in the index

n The date/time the index was created, and the last modified date/time

Add an index

Use the following procedure to add an index.

Procedure

1. Click Administration > Indexes.

2. In the Indexes window, click the add icon.

3. In the Add Index window, complete the following fields:

l Index name

l Description

l Analyzer

l User/group

When an index is created, the user that logged in is added as the default Search Admin for that index, and is assigned the Admin: All access role. If required, this user can be removed, or the role can be changed.

4. To give additional users or groups access to the index, or to replace the default user, in the Users/Groups section, click the add icon.

a. In the Select user window, type the name of the user or group to add, and then click Find.

b. To select the user/group in the search results, select the row, and then click OK.

The user or group is now listed in the Users/Groups section, and is assigned the Admin: Read only role by default.

c. To change the role from Admin: Read only, in the Roles column, click Admin: All access.

5. In the Analyzer field, specify one of the following analyzer options:

l Standard (recommended)

l Simple

l Whitespace

l Languages

6. Click Save.

Results

The new index appears in the list.
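To illustrate how the analyzer choice affects what can be matched in a search, the following sketch approximates three of the options in plain Python. It assumes the options correspond to the Elasticsearch analyzers of the same names; the regular expressions are rough approximations (the real Standard analyzer follows Unicode text segmentation rules), and the Languages option is not modeled.

```python
import re

# Rough approximations of three Elasticsearch analyzers (assumption: the
# Analyzer options in the Add Index window map to these analyzers).


def whitespace_tokens(text: str) -> list:
    """Whitespace: split on whitespace only; case and punctuation are kept."""
    return text.split()


def simple_tokens(text: str) -> list:
    """Simple: lowercase, then split at every non-letter (digits are dropped)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def standard_tokens(text: str) -> list:
    """Standard (approximation): lowercase, keep runs of letters and digits."""
    return re.findall(r"\w+", text.lower())


SAMPLE = "Q3 Budget-Report FINAL v2"
for name, analyze in [("whitespace", whitespace_tokens),
                      ("simple", simple_tokens),
                      ("standard", standard_tokens)]:
    print(f"{name:10} {analyze(SAMPLE)}")
```

With this sample, only the approximated Standard analyzer produces both "report" and "v2" as searchable terms, which is consistent with Standard being the recommended option.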

Edit an index

Use the following procedure to edit existing indexes.

Procedure

1. Click Administration > Indexes.

The list of indexes appears.

2. To edit an index, select it, and then click the edit icon.

The Manage Search Roles page appears and displays the following index details:

l Index name

l Index description

l User/Group

You can add users or groups to the index, or remove them from it. Apply or change the following permissions for a specified user or group:

l Admin: Read only

l Admin: All access

3. Make the required changes, and then click Update.

Remove an index

Use the following procedure to remove an index.

Procedure

1. Click Administration > Indexes.

The list of indexes appears.

2. To remove an index, select it, and then click the remove icon.

3. When prompted, type DELETE, and then click Confirm.

The prompt is similar to the following:

If you delete this index, its contents will be permanently lost. Type DELETE to confirm

Results

The index no longer appears in the list.

CHAPTER 7

Collections

This section includes the following topics:

l Collection activities

l Managing collection activities

Collection activities

Collection activities identify the backup clients to be indexed and the rules that define how indexing is applied, for example, the time and duration of the indexing.

Add a collection activity

Add a collection activity to select Avamar and NetWorker backup clients to index.

Procedure

1. Click Administration > Collections.

2. In the Collections window, click the add icon.

The Activity Information page appears.

Add collection activity information

On the Activity Information page, specify the following information.

Procedure

1. In the Name field, type a name for the collection activity.

2. In the Description field, type a description of the collection activity.

3. In the Index field, select an index from the list.

4. In the Specify information to index field, select one of the following options:

l Document metadata only

l Full content index, including metadata and text

5. Click Save & Next.

The Sources page appears.

Add sources to a collection activity

On the Sources page, specify the following information.

Procedure

1. To display the list of backup servers, click .

2. To display the list of available clients, select one or more Avamar or NetWorker backup servers.

You can search or filter on the Backup Server or Backup Clients list.

3. Select one or more backup clients from the list, and then click Add.

If required, click Refresh to update the list of clients from the backup server.

4. Click OK.

Note

A single collection can have clients from multiple servers, and from both Avamar servers and NetWorker servers.

5. Click Save & Next.

The Scope page appears.

Determine the collection activity scope

On the Scope page, specify the following information.

Procedure

1. Complete the following fields:

l The Activity Index Type field displays the index type.

l In the Content Filter field, select one of the following options:

n Index all documents (*.*) (default)

n Index only the specific documents

2. Click Save & Next.

The Schedule page appears.
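The effect of the Content Filter options on the Scope page can be sketched with file-name patterns. This is an illustration only: the guide does not document the filter syntax, so the pattern list, the case-insensitive matching, and the function name matches_content_filter are assumptions.

```python
from fnmatch import fnmatchcase

# Hypothetical sketch of a content filter: "*.*" means "index all documents",
# otherwise only files matching one of the listed patterns are indexed.


def matches_content_filter(file_name: str, patterns=("*.*",)) -> bool:
    """Return True if the file would be selected for indexing."""
    name = file_name.lower()
    for pattern in patterns:
        # Treat "*.*" as match-all, including names without an extension.
        if pattern in ("*", "*.*"):
            return True
        if fnmatchcase(name, pattern.lower()):
            return True
    return False


files = ["Report.DOCX", "budget.xlsx", "photo.png", "README"]
print([f for f in files if matches_content_filter(f)])
print([f for f in files if matches_content_filter(f, ["*.docx", "*.xlsx"])])
```

Restricting the filter to specific document types reduces the number of items each indexing task has to process.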

Create a schedule for collection activities

On the Schedule page, specify the following information.

You can define the time, duration, and recurrence schedule for indexing. Configure collection activities to recur daily or weekly to match the backup schedule. Matching the backup schedule ensures that new backups are processed for indexing when they complete. Schedule the indexing window to occur after the backup window.

Procedure

1. In the Recurrence pattern field, select the schedule for indexing.

2. In the Start time field, select the time at which you want to begin indexing.

The option ASAP is enabled by default.

3. Select a Start date.

4. In the Duration field, select a duration for the indexing to occur.

5. In the Range of occurrence field, set the end date to stop indexing.

6. Click Save & Next.

A list of Collection Activities displays.
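The schedule fields above (recurrence pattern, start time, start date, duration, and end date) together define a series of indexing windows. The following sketch shows one way to enumerate them; the function name indexing_windows and the every_days parameter are hypothetical and are not part of the product.

```python
from datetime import date, datetime, time, timedelta
from typing import Iterator, Tuple

# Hypothetical sketch: list the indexing windows implied by a schedule.
# every_days=1 models a daily recurrence, every_days=7 a weekly one.


def indexing_windows(start_date: date, start_time: time, duration: timedelta,
                     end_date: date, every_days: int = 1
                     ) -> Iterator[Tuple[datetime, datetime]]:
    """Yield (window_start, window_end) for each occurrence up to end_date."""
    day = start_date
    while day <= end_date:
        begin = datetime.combine(day, start_time)
        yield begin, begin + duration
        day += timedelta(days=every_days)


# Example: backups finish by 22:00, so index nightly from 22:00 for 4 hours.
for begin, end in indexing_windows(date(2018, 8, 6), time(22, 0),
                                   timedelta(hours=4), date(2018, 8, 8)):
    print(begin, "to", end)
```

Choosing a start time after the backup window closes keeps each indexing window from competing with the backups it is meant to process.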

View summary details for a collection activity

Before you finish creating the collection activity, you can view the status and summary information.

Procedure

1. To ensure that the details are correct, review the information.

2. To finish creating the collection activity, click Finish.

The collection activity status changes to Pending.

Managing collection activities

The following sections include information about managing collection activities.

For a collection activity, you can perform the following actions:

l Edit

l View

l Remove

l Run

l Enable

l Disable

Edit a collection activity

You can edit an existing collection activity.

Procedure

1. In the Collections window, click the name of the activity you want to edit.

The Activity Summary page appears and displays the activity details.

2. To edit the collection activity, click the edit icon.

The Activity Information page appears.

3. Edit the values in the following fields as required:

l Name

l Description

l Index

l Specify information to index

Note

You cannot edit a completed collection activity.

4. Click Save & Next.

View collection activities

You can view collection activity details for all completed jobs or jobs that are currently running.

To view a summary of the configuration settings for that activity, click the activity name. The Activity Summary page appears and displays the activity details.

To view activity details for a job, in the Jobs field, click Details. The list of jobs appears and displays their respective details.

You can show or hide the taskbar, and filter the activities by Name and Status. To modify the filter for activities, click the icons to filter, select all, and refresh, or click Reset filter.

Remove a collection activity

To remove a collection activity, perform the following steps.

Procedure

1. Select one or more activities from the list, and then click the remove icon.

A confirmation message appears.

2. Click Confirm.

Run a collection activity

To begin indexing for a collection activity, run the collection activity.

To run the collection activity, select the activity from the list, and then click the run icon.

The activity begins processing the backup for indexing and displays the time at which you ran the activity.

Enable collection activities

To run a collection activity based on the schedule or to force a collection activity to run now, enable one or more collection activities.

Select one or more activities from the list, and then from the taskbar, click the Enable icon.

Disable collection activities

To prevent a collection activity from running, disable one or more collection activities.

Select one or more activities from the list, and then from the taskbar, click the Disable icon.

Note

If you disable a collection activity, it will not start regardless of the schedule.

CHAPTER 8

Jobs

This section includes the following topics:

l Jobs

l Jobs views

l Data Protection Search job types

l Job types and statuses

l Monitor search jobs

l Monitor index jobs

l Monitor system jobs

l View system health

l View system services

Jobs

In the Jobs section of the Data Protection Search UI, you can view complete details and status for Data Protection Search collection, background, and system maintenance jobs.

Index Admins create collection activities, which can be recurring. Every time a collection activity runs, a job is created. One collection activity can have many jobs.

Background system jobs monitor the system components, refresh client lists and backups, and run garbage collection to remove expired backups from the index. Jobs can also be triggered from the Search UI, including Restore, Download, and Query jobs.

Figure 11 Jobs UI

By default, the collection, restore/download, and long query jobs are in the jobs list. System jobs are hidden to keep the list free from unnecessary information, and only the most important job types display.

System jobs can be added to the list by clicking the Show system jobs icon on the toolbar, or by enabling System in the Jobs filter window.

Index Admins can view only Index jobs (both metadata only and full-content).

Available options for Jobs

The actions available from the taskbar on the right side of the Jobs UI are listed in the following table.

Table 12 Available actions for Jobs

Available action  Description
Stop              Stop long-running or very large jobs, such as a full-content index job that sends many requests. Stopping a job prevents any further requests from being sent to the Source. Note: You can stop job requests that are in the queue. However, it is not possible to stop requests that are already being processed on the backup server.
Filter            Filter on job type, status, the user who triggered the job, and activity name. Any job filters enabled here are active for the duration of the current session. If you return to the window and filters are enabled, the message "Filtered results. Click the filter icon in the taskbar to remove or change the filter." is displayed above the list.
Show system jobs  Use Reset filter to return to a full view of all jobs.
Refresh           Refresh the list of jobs and their status.

Jobs views

The job status appears as an icon overlay when you hover over any job in the list.

Table 13 Jobs views based on Admin permissions

Job type                                                                         Visible to DPSearch Admin  Visible to Index Admin
Index (both metadata only and full content)                                      No                         Yes
Restore                                                                          Yes                        No
System maintenance (includes garbage collection, source cache, and node status)  Yes                        No
Query                                                                            Yes                        No

The Jobs view is filtered by using the following criteria:

l Job type

l Job status

l Triggered by (user)

l Activity name

The Reset icon in the taskbar resets the filter to the default (show all jobs).
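The combination of role-based visibility (Table 13) and the four filter criteria can be sketched as follows. This is an illustrative model only: the Job class, the VISIBLE_JOB_TYPES mapping, and the filter_jobs function are hypothetical names, and the job type labels are taken from Table 13.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Job:
    job_type: str
    status: str
    triggered_by: str
    activity_name: str


# Visibility per role, transcribed from Table 13.
VISIBLE_JOB_TYPES = {
    "DPSearch Admin": {"Restore", "System maintenance", "Query"},
    "Index Admin": {"Index"},
}


def filter_jobs(jobs: Iterable[Job], role: str,
                job_type: Optional[str] = None, status: Optional[str] = None,
                triggered_by: Optional[str] = None,
                activity_name: Optional[str] = None) -> List[Job]:
    """Apply role visibility first, then any filter criteria that are set."""
    wanted = {"job_type": job_type, "status": status,
              "triggered_by": triggered_by, "activity_name": activity_name}
    result = []
    for job in jobs:
        if job.job_type not in VISIBLE_JOB_TYPES[role]:
            continue
        # A criterion left as None is not applied (the Reset filter state).
        if all(v is None or getattr(job, k) == v for k, v in wanted.items()):
            result.append(job)
    return result


jobs = [
    Job("Index", "Running", "alice", "Nightly metadata"),
    Job("Index", "Failed", "alice", "Weekly full content"),
    Job("Restore", "Success", "bob", ""),
]
print(filter_jobs(jobs, "Index Admin", status="Failed"))
```

Calling filter_jobs with no criteria corresponds to the default view for the given role.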

Tasks

Each job is broken down into one or more subtasks, which are more granular work items that reflect one portion of the job. For example, when adding a backup server, a job is created to register agents for that backup server. Since every Data Protection Search Worker node must register its own agent, the job results in a separate task for each Data Protection Search Worker. Similarly, when a collection activity job runs, there is a separate task for each backup/save set on every indexed client. For full content indexing, there can be more than one task for large backups.

It is useful to have access to the more granular viewpoint that tasks offer, particularly for failed jobs, or even just to understand how much of a long-running job has completed so far. Therefore, every entry in the Job list has a View Tasks link to view the list of tasks for that job. Each entry in the task list includes a status and details specific to the type of task. For example, a collection job shows details of the backup set, backup client, and backup server. Statistics are shown, indicating how many items have been scanned, processed, succeeded, failed, or are duplicates or updates. The Worker that processed the task is also identified, which can help in finding the correct log file to troubleshoot an issue.

If a task failed, hovering over the icon shows an error message indicating the reason for the failure. A toolbar to the right of the list provides the option to refresh the list, or to return to the jobs list. It is also possible to filter and/or sort the list of jobs.
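The way task statuses indicate how much of a long-running job has completed can be sketched as a simple tally. This is an illustration, not the product's calculation: the function name summarize_tasks is hypothetical, and treating Success, Failed, Stopped, and Timeout as finished states is an assumption.

```python
from collections import Counter
from typing import Iterable, Tuple

# Assumption: these statuses mean a task will do no further work.
FINISHED = {"Success", "Failed", "Stopped", "Timeout"}


def summarize_tasks(statuses: Iterable[str]) -> Tuple[Counter, int]:
    """Return per-status counts and the percentage of finished tasks."""
    counts = Counter(statuses)
    total = sum(counts.values())
    done = sum(n for status, n in counts.items() if status in FINISHED)
    percent = round(100 * done / total) if total else 0
    return counts, percent


# Example: a collection job with one task per backup/save set.
task_statuses = ["Success"] * 3 + ["Failed"] + ["Running"] + ["Pending"] * 3
counts, percent = summarize_tasks(task_statuses)
print(dict(counts), f"{percent}% finished")
```

A non-zero Failed count is the cue to open the task list and hover over the failed task for its error message.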

Data Protection Search job types

The Jobs section of the Admin UI provides information and status for indexing, search related, and system jobs.

Indexing jobs

When a collection activity is created, there are a number of scheduling options available. The collection can be scheduled to run immediately (ASAP) or at a future date, and can also be scheduled to recur, for example, daily, or weekly.

Every time the collection activity runs, a job is created to process it. Clicking the activity name displays a list of all jobs that ran for the activity.

In the Jobs View, statistics are listed for each indexing job, including:

l The start time

l The end time

l The number of items that are processed, succeeded, and failed

Successful items are further broken down by the number of duplicate items, and updated items:

l Duplicate items are unchanged, and appear in multiple backups

l Updated items are initially indexed for metadata only, and later updated to be indexed for full content

For activities with many clients, and clients with many backups or files, it is not unusual for indexing and statistics to take a significant amount of time. Indexing must first compile an internal list of the files in the backups and divide that list into tasks. Querying the backup servers for this information takes additional time.

Search related jobs

Using the companion search web UI, Search Admins can complete actions on selected search hits. The actions include downloading and restoring search hits, and creating long running queries filtered by the backup date. These jobs can be tracked in the Search UI, and are also listed in the Jobs View in the Admin UI.

System jobs

To maintain the system, Data Protection Search runs background jobs at regular intervals. These system jobs include checking the status of Worker nodes and configured backup servers, and running garbage collection activities. Garbage collection removes files from the index that no longer exist on the backup server.

Status jobs run hourly, garbage collection jobs run daily, and a special garbage collection reconciliation job runs monthly. The garbage collection reconciliation job synchronizes the backup information stored in DPSearch with the information in the backup servers.

By default, system jobs do not appear in the Jobs View. To see them, select the System settings icon.
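The idea behind garbage collection and reconciliation can be sketched as a set comparison between the backups recorded in the index and the backups that still exist on the backup server. This is a conceptual sketch only; the function name reconcile and the backup identifiers are hypothetical, and the guide does not describe how the product performs the comparison.

```python
from typing import Set, Tuple

# Conceptual sketch: compare backup IDs known to the index with those that
# still exist on the backup server.


def reconcile(indexed: Set[str], on_server: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (stale, unindexed) backup IDs.

    stale:     recorded in the index but gone from the backup server, so
               their entries are candidates for removal from the index.
    unindexed: present on the backup server but not recorded in the index.
    """
    stale = indexed - on_server
    unindexed = on_server - indexed
    return stale, unindexed


indexed_backups = {"bk-001", "bk-002", "bk-003"}
server_backups = {"bk-002", "bk-003", "bk-004"}
stale, unindexed = reconcile(indexed_backups, server_backups)
print("remove from index:", sorted(stale))
print("not yet indexed:", sorted(unindexed))
```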

Job types and statuses

To determine a job type or status, from the Jobs window, hover over the job icon.

To view the list of jobs, click Administration > Jobs.

The following table describes the job types.

Table 14 Job types

Job type            Description
Metadata            Performing a metadata index of search results. The default indexing method is metadata only, and basic metadata is collected for each file.
Full content index  Performing a targeted full content index of search results.
Restore             Restoring files to their original location or to an alternate location.
Long query          Scanning the source to prepare an index.
System              Performing system jobs and services, including monitoring the system components, refreshing client lists and backups, and running garbage collection to remove expired backups from the index. Note: System jobs are hidden by default. They can be seen by updating the filters.
Download            Downloading a search result.

The following table describes the task status.

Table 15 Task status

Status    Description
Spawning  Indicates a job has been triggered.
Pending   Indicates a job is waiting to begin.
Running   Indicates a job is currently being performed.
Stopping  Indicates a job is being stopped.
Success   Indicates a job has been performed successfully.
Failed    Indicates a job has failed.
Stopped   Indicates a job has stopped.
Timeout   Indicates a job has timed out. If a source job is still running after 10 minutes, the job is marked as a timeout.
Paused    Indicates a job has been paused. A job can be paused manually. A job pauses automatically when minimal disk space is available.
Pausing   Indicates a job is being paused.
Resuming  Indicates a paused job is restarting.
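The statuses in Table 15 and the 10-minute timeout rule for source jobs can be modeled as follows. This is a minimal sketch: the enum and the apply_timeout function are hypothetical names, and only the 10-minute limit comes from the table.

```python
from datetime import datetime, timedelta
from enum import Enum


class TaskStatus(Enum):
    SPAWNING = "Spawning"
    PENDING = "Pending"
    RUNNING = "Running"
    STOPPING = "Stopping"
    SUCCESS = "Success"
    FAILED = "Failed"
    STOPPED = "Stopped"
    TIMEOUT = "Timeout"
    PAUSED = "Paused"
    PAUSING = "Pausing"
    RESUMING = "Resuming"


# From Table 15: a source job still running after 10 minutes times out.
SOURCE_JOB_TIMEOUT = timedelta(minutes=10)


def apply_timeout(status: TaskStatus, started: datetime, now: datetime,
                  limit: timedelta = SOURCE_JOB_TIMEOUT) -> TaskStatus:
    """Mark a still-running source job as Timeout once the limit has passed."""
    if status is TaskStatus.RUNNING and now - started > limit:
        return TaskStatus.TIMEOUT
    return status


started = datetime(2018, 8, 6, 12, 0)
print(apply_timeout(TaskStatus.RUNNING, started, datetime(2018, 8, 6, 12, 11)).value)
```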

Monitor search jobs

Search jobs include full content indexing (FCI) and downloading.

To monitor search jobs as a Search User or Search Administrator, select Search > View Jobs.

To monitor jobs as an Application Administrator, select Administration > Jobs. Use the following procedure to limit the jobs view to search jobs only.

Procedure

1. Click Administration > Jobs.

The Jobs window opens.

2. To open the list of available job filters, click the filter icon.

The filters menu appears.

3. To narrow the jobs list to display only search jobs, ensure that only the Full content index filter is selected.

4. Click Apply.

Only search jobs display in the job list.

5. To view a detailed list of tasks for a job, click View tasks.

The status of each task appears.

6. To refresh the list of jobs, click the refresh icon.

Monitor index jobs

Procedure

1. Click Administration > Jobs.

The Jobs window opens.

2. To open the list of available job filters, click the filter icon.

The filters menu appears.

3. To narrow the jobs list to display only index jobs, ensure that only the following job types are selected:

l Metadata

l Full content index

l Long query

4. Click Apply.

Only index jobs display in the job list.

5. To view a detailed list of tasks for a job, click View tasks.

The status of each task displays.

6. To refresh the list of jobs, click the refresh icon.

Monitor system jobs

System jobs run at regular intervals. By default, system jobs do not appear in the Jobs view.

To view system jobs, use the following procedure.

Procedure

1. Select Administration > Jobs.

The Jobs window opens.

2. To view the system jobs together with all job types, click the Show system jobs icon.

System jobs appear with all other jobs.

3. To narrow the jobs list to display only system jobs:

a. Click the filter icon.

The filters menu appears.

b. Ensure that only the System filter is selected.

c. Click Apply.

Only system jobs appear in the list of jobs.

4. To view a detailed list of tasks for a job, click View tasks.

The status of each task displays.

5. To refresh the list of jobs, click the refresh icon.

View system health

You can view storage and memory thresholds and current values for the CIS and Worker nodes in the System window.

Procedure

1. To view the health of the system, select Administration > System.

The value in the Current column is color coded according to the following threshold values:

Color Threshold value

Green Acceptable

Yellow Near threshold

Red Exceeds threshold

When current values exceed the acceptable thresholds, the system generates notifications.

Note

If the amount of available data disk space falls below the lower threshold of the required space, all active jobs are paused. Manually resume the paused jobs after more disk space becomes available.

2. To view system notifications, select Administration > Dashboard > Notifications > More.
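The color coding of the Current column can be sketched as a comparison against a threshold. The guide states only that the threshold values are hard coded, so the numbers below, the 10 percent "near threshold" margin, and the function name threshold_color are all assumptions made for illustration.

```python
# Hypothetical sketch of the Current column color coding. The actual
# thresholds are hard coded in the product and are not listed in this guide.


def threshold_color(current: float, threshold: float,
                    near_margin: float = 0.10) -> str:
    """Return Green, Yellow, or Red for a usage value in percent."""
    if current > threshold:
        return "Red"      # Exceeds threshold
    if current >= threshold * (1 - near_margin):
        return "Yellow"   # Near threshold
    return "Green"        # Acceptable


# Example with an assumed disk usage threshold of 80 percent.
for usage in (50, 75, 85):
    print(usage, threshold_color(usage, threshold=80))
```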

View system services

You can view the components and status of the Worker nodes and CIS nodes in the System window.

Note

The status updates every 10 minutes. Clicking the refresh button does not cause the status to update more quickly.

Procedure

1. To view system services, select Administration > System.

When the status changes, the system generates notifications.

Note

If a service stops running, the system tries to restart the service automatically. If the problem that stopped the service is unresolved, for example, when the system is out of disk space or the network is down, the service might not restart. After a service restarts, the UI might not reflect the change in status until the next 10-minute refresh cycle.

2. To view system notifications, select Administration > Dashboard > Notifications > More.

CHAPTER 9

System

This section includes the following topics:

l Monitoring system health and services

Monitoring system health and services

In the System window, you can view detailed information on all Data Protection Search Worker Nodes and Common Indexing System (CIS) Nodes.

To view the health of the system and system services, click Administration > System.

View worker node health

You can view storage and memory thresholds, current values, components, and status for the worker nodes in the System window.

Dedicated worker nodes allow for faster indexing of backups and spread out other processing tasks, such as downloads and restores.

System

The threshold values are hard coded. The current usage is listed for CPU, memory, and disk space.

Note

Values are updated at an interval of one hour. After a system restart, memory and CPU utilization appear high.

The value in the Current column is color coded according to the following threshold values:

Color Threshold value

Green Acceptable

Yellow Near threshold

Red Exceeds threshold

When current values exceed the acceptable thresholds, the system generates notifications.

Note

If the amount of available data disk space falls below the lower threshold of the required space, all active jobs are paused. Manually resume the paused jobs after more disk space becomes available.

To view system notifications, click Administration > Dashboard > Notifications > More.

Components

View the current version of the following components:

l Worker Node

l Admin web

l Admin API

l Search web

l Search API

Note

The version numbers might vary depending on patch levels and applied hotfixes.

Agents

View the installed agents and the status for each Avamar and NetWorker backup server. When the status changes, the system generates notifications. The status updates every 10 minutes.

Note

If a service stops running, the system tries to restart the service automatically. If the problem that stopped the service is unresolved, for example, when the system is out of disk space or the network is down, the service might not restart. After a service restarts, the UI might not reflect the change in status until the next 10-minute refresh cycle.

View CIS node health

You can view storage and memory thresholds and current values for the CIS nodes in the System window.

A CIS node is a Data Protection Search component that provides a security layer on top of Elasticsearch.

The threshold values are hard coded. The current usage is listed for CPU, memory, and disk space.

Note

Values are updated at an interval of one hour. After a system restart, memory and CPU utilization appear high.

The value in the Current column is color coded according to the following threshold values:

Color Threshold value

Green Acceptable

Yellow Near threshold

Red Exceeds threshold

When current values exceed the acceptable thresholds, the system generates notifications.

Note

If the amount of available data disk space falls below the lower threshold of the required space, all active jobs are paused. Manually resume the paused jobs after more disk space becomes available.

To view system notifications, click Administration > Dashboard > Notifications > More.

CHAPTER 10

Options

This section includes the following topics:

l Data Protection Search Options

l Configuring Email notifications

Data Protection Search Options

The Options section of the Data Protection Search Admin UI provides the ability to configure and modify system and search options.

The following figure illustrates the Options section of the Admin UI.

Figure 12 Options

The following table lists the available options.

Table 16 System and search options

Option                      Description
Search Options              Set the max hits to restore to limit the number of files that can be restored in one session. The range is from 101 to 5000 files.
Session Options             Set the session inactivity time out for both the Admin UI and the Search UI. The default is 1 hour before a login is required. You can set the range from 3 minutes to 24 hours.
LDAP Options                Modify the LDAP users/options (host) specified at deployment.
Email Notification Options  Enable email notifications and set SMTP options.
Index Options               Set the value for the following: File types to exclude from FCI (list the file name extensions to skip during full content indexing); Number of replicas; Apply Replicas settings to existing indexes.
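The documented ranges for the Search and Session options can be expressed as a simple validation check. This is a sketch for illustration: the class name OptionValues and its field names are hypothetical, while the limits (101 to 5000 files, 3 minutes to 24 hours, 1 hour default) are taken from Table 16.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List


@dataclass
class OptionValues:
    """Hypothetical container for two of the settings in Table 16."""
    max_hits_to_restore: int
    session_timeout: timedelta = timedelta(hours=1)  # documented default

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the values are valid."""
        errors = []
        if not 101 <= self.max_hits_to_restore <= 5000:
            errors.append("max hits to restore must be from 101 to 5000")
        if not timedelta(minutes=3) <= self.session_timeout <= timedelta(hours=24):
            errors.append("session time out must be from 3 minutes to 24 hours")
        return errors


print(OptionValues(max_hits_to_restore=500).validate())
print(OptionValues(max_hits_to_restore=50,
                   session_timeout=timedelta(minutes=1)).validate())
```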

The following table lists jobs and activities that trigger notifications.

Table 17 Supported notifications

Activity                                   Notification trigger
Restore operation                          The notification is sent to the Search Admin when a restore operation completes, succeeds, fails, stops, or times out.
Collection job (metadata or full-content)  The notification is sent to the Admin that created the activity or forced a Run now when a collection job completes, succeeds, fails, stops, or times out.
Index state                                The notification is sent to all Index Admins when an index changes state (healthy, warning, or error).
Backup server state                        The notification is sent to all DPSearch Admins when the backup server changes state (healthy or disconnected).

Configuring Email notifications

You can configure Data Protection Search to send email notifications to specified SMTP users and hosts.

The following figure illustrates the Email Notification Options to configure.

Figure 13 Configure email notifications

Procedure

1. To enable email notifications for Data Protection Search, toggle Email notifications from OFF to ON.

Email notifications are disabled (OFF) by default.

2. Configure the following for email notifications:

l SMTP (IP of the SMTP host)

l Port (Set an appropriate port, typically 25 or 587)

l SMTP User (An account authorized to connect to the SMTP service)

l Password

Note

Notifications are always enabled, and can be viewed in the Notifications section of the dashboard. These options relate only to whether email notifications are sent.

3. Click Validate.

Results

Click to select the notification and view its details. Email Notifications take effect 15 minutes after this setting is enabled.
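
Before entering the SMTP values in the Email Notification Options, it can help to confirm from another host that the SMTP host, port, and account accept mail. The following is a minimal sketch that uses only the Python standard library. The addresses, the subject text, and the use of STARTTLS are assumptions for illustration, not settings taken from Data Protection Search.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender: str, recipient: str) -> EmailMessage:
    """Build a small plain-text message for checking the SMTP settings."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Data Protection Search SMTP test"
    msg.set_content("Test message to confirm the SMTP host, port, and account.")
    return msg


def send_test_message(host: str, port: int, user: str, password: str,
                      msg: EmailMessage, use_starttls: bool = True) -> None:
    """Send msg through the SMTP host. Raises an smtplib or OS error on failure."""
    with smtplib.SMTP(host, port, timeout=10) as smtp:
        if use_starttls:
            # Assumption: the SMTP service offers STARTTLS (common on port 587).
            smtp.starttls()
        smtp.login(user, password)
        smtp.send_message(msg)
```

If `send_test_message` completes without raising an exception, the same host, port, user, and password are reasonable candidates for the fields in step 2.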

CHAPTER 11

Performing Searches

This section includes the following topics.

l About Searches

l Access the Search window

l Optimize search performance

l Optimize search results

l Narrow a search by NetWorker or Avamar

l Narrow a search by file type

l Narrow a search by file attribute

l Narrow a search by date and time attributes

l Include content that was not indexed in the search

l Using keywords to search

l Restore files in Data Protection Search

l Search results

l Search criteria management

l Search performance factors

About Searches Data Protection Search performs metadata searches on indexed files and folders, and full content searches on files that have had their full content indexed. Data Protection Search supports keyword and advanced Lucene search queries. Search results include a summary, such as filename, location, and backup client.

Search provides the following highlighting techniques:

l If a file is indexed for metadata, keywords or phrases that are found in the file name or path are highlighted.

l If a file is indexed for full content indexing (FCI), keywords or phrases that are found in the body of the file are returned with a preview of the full contents. The preview displays and highlights the words around the matched text.

By default, search results are sorted by Relevance, which indicates how closely each result matches the search criteria.

The following figure illustrates the Search section of the Data Protection Search UI.

Figure 14 Search UI

Access the Search window The Data Protection Search virtual appliance is accessed as a web-based interface. Depending on your user privileges, you can perform administration or search tasks.

To perform searches, you must be assigned to one of the following roles:

l DPSearch Admin

l Index Admin

Note

You must log in to the Data Protection Search UI each time it is opened, or after an inactivity time out of 20 minutes or more. The default inactivity timeout is 20 minutes.

To perform searches, complete the following steps.

Procedure

1. In a web browser, type the following:

https://nodename/search

Note

You might be required to acknowledge a browser warning regarding self-signed certificates before continuing.

2. Log in:

a. Type the username and password.

Note

The default username is Admin. A System Administrator or Application Administrator can assign roles to users.

b. Click Sign In.

The Search UI opens.

Optimize search performance At the core of Data Protection Search is Elasticsearch, a high-performance indexing and search system capable of searching billions of objects within seconds.

To leverage Elasticsearch replication, at least two nodes are required, although a single node is supported.

To further increase the speed at which search results are returned, observe the following best practices:

l Add specific queries only if you know the details of the content that you are looking for. For example, specify the file type, date, and owner using the search filters instead of typing them directly into the Search bar.

l Limit the number of collections that are running for indexes during a search operation. Collections can slow down some types of queries.

l Limit the number of simultaneous queries by different users.

l Use search filters to narrow the search scope and limit the number of results.

Optimize search results You can use one or a combination of the following filter options to narrow and optimize search results:

l Index

l Platform

l Server

l Client

l File Type

l File Name

l Size (in KB)

l Last Modification Date

l Backup Date

l Location

l Unindexable Content

The following figure illustrates the Search section of the Data Protection Search UI.

Figure 15 Search UI

Change search filter Procedure

1. In the Search window, click Add | Remove Filters.

2. Click Save and Close.

View search jobs Procedure

1. In the Search window, click View Jobs.

2. Click Save and Close.

Narrow a search by index type Procedure

1. From the product Search window, click Index.

2. In the Index field list, select the indexes to include in the search, and clear the indexes to exclude.

Narrow a search by NetWorker or Avamar You can narrow the search results by applying one or more of the following search filters.

Specify a platform Procedure

1. Open the Search window, by clicking .

The Search window appears and displays the search options.

2. Click Platform.

3. In the Platform field list, select one of the following options:

l All

l Avamar

l NetWorker

Specify a server Procedure

1. Open the Search window, by clicking .

The Search window appears and displays the search options.

2. Click Server.

3. In the Server field list, select the name of an Avamar or NetWorker server.

4. To display the Source Distribution Chart for Backup Servers and Clients, click .

Specify a client Procedure

1. Open the Search window, by clicking .

The Search window appears and displays the search options.

2. Click Client.

3. In the Client field list, select an Avamar or NetWorker client.

Narrow a search by file type To narrow the search by file type, perform one of the following steps from the Search window.

Search by file type To narrow the search by file type:

1. Click File Type.

2. In the File Type field, type a file name extension or multiple file extensions that are separated by commas.

3. In the File Type field, click . The pie chart breaks down the current search by frequency of file type:

l Only the top 10 most frequent file types in the search display.

l Each file type extension is listed below the pie chart.

Focus on specific file types To narrow the view of the pie chart:

1. In the File Type field, click .

2. Click a section of the pie chart or a file name extension from the legend. The search results for a specific file type appear.
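
The "top 10 most frequent file types" breakdown described above can be illustrated with a short sketch. This is not how the product computes the chart. It only shows the same kind of frequency count over a list of file paths, and the function name and the "(none)" label for files without an extension are assumptions.

```python
from collections import Counter
from pathlib import PurePosixPath


def top_file_types(paths, limit=10):
    """Return the `limit` most frequent file name extensions with their counts."""
    counts = Counter(
        # Lowercase the extension and drop the leading dot; label files
        # that have no extension as "(none)".
        PurePosixPath(p).suffix.lower().lstrip(".") or "(none)"
        for p in paths
    )
    return counts.most_common(limit)
```

For example, calling `top_file_types(paths)` on the paths of the current search hits returns up to ten `(extension, count)` pairs, most frequent first.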

Figure 16 File Type Distribution Chart (Top 10)

Narrow a search by file attribute You can narrow the search results by applying one or more of the following search filters.

File name Procedure

1. Open the Search window, by clicking .

The Search window appears and displays the search options.

2. Click File Name.

3. In the File Name field, select one of the following options:

l Contains (default)

l Begins with

l Ends with

4. In the File Name field, type the file name.

The file name can include wildcard characters (*).
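
The three File Name options and the * wildcard can be pictured with the following sketch. It is an illustration only: the function name is hypothetical, and case-insensitive matching is an assumption rather than documented product behavior.

```python
from fnmatch import fnmatchcase

# How each File Name option wraps the typed text in wildcards.
PATTERNS = {
    "contains": "*{}*",
    "begins with": "{}*",
    "ends with": "*{}",
}


def file_name_matches(name, term, mode="contains"):
    """Return True if `name` matches `term` under the chosen option."""
    pattern = PATTERNS[mode].format(term)
    # Assumption: matching ignores case. Note that fnmatch also treats
    # "?" and "[...]" as wildcards, which the guide does not mention.
    return fnmatchcase(name.lower(), pattern.lower())
```

With this sketch, `file_name_matches("Budget_2018.xlsx", "20*8")` is true because the default Contains option surrounds the typed text with wildcards.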

File size Procedure

1. Open the Search window, by clicking .

The Search window appears and displays the search options.

2. Click Size (in KB).

3. In the Size (in KB) field, specify a minimum or maximum file size.

4. To view the File Size Distribution Chart:

a. In the Size (in KB) field, click .

b. Click a section of the bar chart.

The search results appear.

File location Procedure

1. Open the Search window, by clicking .

The Search window appears and displays the search options.

2. Click Location.

3. In the Location field, type a file path, and then enclose the file path in quotation marks.

For example: "/ifs/data/projects"

4. To search for a keyword inside of the file path, type the keyword with no extra characters.

Narrow a search by date and time attributes You can narrow the search results by applying one or more of the following search filters.

Last modification date The date represents the date that the file or folder was last modified.

Procedure

1. Open the Search window, by clicking .

The Search window appears and displays the search options.

2. Click Last modification date.

3. In the Last modification date field, type a start or end date.

4. To view the Last Modification Date Distribution Chart:

a. In the Last modification date field, click .

b. Click a section of the bar chart.

The search results appear.

Backup date Procedure

1. Open the Search window, by clicking .

The Search window appears and displays the search options.

2. Click Backup Date.

3. In the Backup Date field, type a start or end date.

Include content that was not indexed in the search To search for items that were not successfully full content indexed, consider the following factors:

l File type or size.

l Text was not extracted from the file.

Procedure

1. Open the Search window, by clicking .

The Search window appears and displays the search options.

2. Click Unindexable Content.

3. Select Yes.

Using keywords to search To perform a keyword search, use one of the following procedures.

Perform a basic search Procedure

1. Open the Search window by clicking .

The Search window appears and displays the search options.

2. In the Search dialog box, type a keyword.

3. Click Search.

The Search Results page appears.

4. (Optional) To view the file metadata, in the Search results page, click More.

5. (Optional) To download a file, full access permissions are required:

Note

Downloading a file retrieves the file from the backup server to the Data Protection Search node. You can download files locally only if you are the owner of the file or if you have search administrator rights with all access privileges.

a. On the Search Results page, select Download from the list of actions.

b. Beside the files that you want to download, select a checkbox.

c. Download the file:

l To download all the file entries that appear on the page, click Select All.

l To download only the selected files, click Selected.

l To download search hits that match the current query, click All.

d. (Optional) To monitor progress of the download, click View Jobs.

6. To change the criteria on which the results are sorted, click another item from the Sort By (DESC) filter.

Relevance (sort score) is the default.

Figure 17 Search filters

7. To view a text representation of the original file which is pulled from the index, click Preview

Full content indexing provides all the text without the original application formatting. A near real representation of the file is displayed with enough data to identify the file. Image files have a thumbnail instead.

Figure 18 Example preview

Note

It is not possible to create a preview for every file, so a preview is sometimes not available. If a full content indexed file contains less than 2 MB of text, the preview includes the original formatting, where possible. If there is more than 2 MB of text, only the text itself is in the preview, without formatting. Embedded images are never in the preview, regardless of file size. If the original file type is an image, the preview might be available with a smaller size of the picture.

Perform an advanced search by using a Lucene query Procedure

1. Open the Search window by clicking .

The Search window appears and displays the search options.

2. In the Search dialog box, type a Lucene query.

3. Click Search.

The Search Results page appears.

The following table includes Lucene syntax examples.

Table 18 Example Lucene syntax

Syntax examples Description

field:value The field contains the value

field:"phrase" The field contains the exact phrase

field:(value1 OR value2) The field contains one or both values

_missing_:field The field is missing a value

_exists_:field The field has a non-null value

field:qu?ck bro* Use wildcards

xvname:/joh?n(ath[oa]n)/ Use regular expression

field:quikc~ brwn~ foks~ Words that are similar

field:"fox quick"~5 No more than 5 words between them

xvdate:[2003-01-01 TO 2004-01-01] Match any day in range

xvdate:{* TO 2012-12-31} Match any date before 2012-12-31

xvsize:>10000 All files with sizes over 10,000 bytes

quick brown +fox -news The word fox must be present, news must not be

4. (Optional) To view the file metadata, in the Search results page, click More.

5. (Optional) To download a file, full access permissions are required:

Note

Downloading a file retrieves the file from the backup server to the Data Protection Search node. You can download files locally only if you are the owner of the file or if you have search administrator rights with all access privileges.

a. On the Search Results page, select Download from the list of actions.

b. Beside the files that you want to download, select a checkbox.

c. Download the file:

l To download all the file entries that appear on the page, click Select All.

l To download only the selected files, click Selected.

l To download search hits that match the current query, click All.

d. (Optional) To monitor progress of the download, click View Jobs.

6. To change the criteria on which the results are sorted, click another item from the Sort By (DESC) filter.

Relevance (sort score) is the default.

Figure 19 Search filters

7. To view a text representation of the original file which is pulled from the index, click Preview

Full content indexing provides all the text without the original application formatting. A near real representation of the file is displayed with enough data to identify the file. Image files have a thumbnail instead.

Figure 20 Example preview

Note

It is not possible to create a preview for every file, so a preview is sometimes not available. If a full content indexed file contains less than 2 MB of text, the preview includes the original formatting, where possible. If there is more than 2 MB of text, only the text itself is in the preview, without formatting. Embedded images are never in the preview, regardless of file size. If the original file type is an image, the preview might be available with a smaller size of the picture.
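
Returning to the Lucene syntax in Table 18: when the same kind of query is typed repeatedly, it can be easier to compose the query string with small helper functions and paste the result into the Search dialog box. The following is a minimal sketch. The helper names are hypothetical, and only the field names xvdate and xvsize come from Table 18.

```python
from datetime import date


def phrase(field: str, text: str) -> str:
    """field:"phrase" with backslashes and double quotes escaped."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def any_of(field: str, values) -> str:
    """field:(value1 OR value2). Values are used as typed, not escaped."""
    return f"{field}:({' OR '.join(values)})"


def date_range(field: str, start: date, end: date) -> str:
    """field:[start TO end], an inclusive range in YYYY-MM-DD form."""
    return f"{field}:[{start.isoformat()} TO {end.isoformat()}]"


def all_of(*clauses: str) -> str:
    """Require every clause by joining them with AND."""
    return " AND ".join(f"({clause})" for clause in clauses)
```

For example, `all_of("xvsize:>10000", date_range("xvdate", date(2003, 1, 1), date(2004, 1, 1)))` produces a single query string that combines the size and date examples from the table.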

Restore files in Data Protection Search Data Protection Search provides the ability to restore files back to their original location on the client from which they were originally backed up. Files can also be restored to an alternate location on that client, or to an entirely different client on the same backup server. However, only Search Admins with full access have permission to restore files to a different client.

Use the following procedure to restore individual, multiple, or all the files to their original location or an alternate location (within the restrictions of the backup server). Preview is available for search hits resulting from full-content indexed searches to verify that you have located the correct files to restore.

Note

NDMP files can typically only be restored to a NAS device of the same type. For example, VNX to VNX, or Isilon to Isilon. Avamar NDMP files can also be restored to a file system client.

Procedure

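1. Select the files to restore by using one of the following methods:

l To restore an individual file, click Restore below the file.

l To restore multiple files, select the checkbox beside each file, and then click Restore.

l To restore all files on the current page, select the checkbox above the search results, and then click Restore.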

Note

When restoring multiple files, ensure all files come from the same backup server. Choosing to restore multiple files from different backup servers might cause the restore operation to fail.
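
One way to follow this guidance when working from an exported or copied list of search hits is to group the files by backup server first, and then restore each group separately. The following is a minimal sketch. The dictionary keys "path" and "backup_server" are hypothetical names chosen for illustration and do not correspond to a documented export format.

```python
from collections import defaultdict


def group_by_backup_server(hits):
    """Group file paths so that each restore request involves one backup server."""
    groups = defaultdict(list)
    for hit in hits:
        groups[hit["backup_server"]].append(hit["path"])
    return dict(groups)
```

Each value in the returned dictionary is a list of files that share a backup server and can be selected together for a single restore operation.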

Search results Different search results, controls, and status are displayed and available for full access Search Admins versus read-only Search Admins.

Table 19 Search results for full access versus read only search admins

Search result component | Full access Search Admin | Read only search Admin

Application icon | Displayed | Displayed

Pathname | Active link to download/open the file | No active link to download/open the file

File name | Displayed | Displayed

Last change date | Displayed | Displayed

Client | Displayed | Displayed

Plugin | Displayed | Displayed

Contextual snippet | Displayed if one or more keywords was found in the full content (body) of the file | No contextual snippet displayed

Platform | Same for full, or read only access | Displayed

File size | Displayed | Displayed

Backup server name | Displayed | Displayed

The following is a list of the Data Protection Search controls for search results:

l Displays the number of matching results

l Displays the time taken to process the query

l Displays a dialog box to view restore jobs

l Change sort criteria

l Restore, Download long query jobs

l Displays a Restore option for individual files

l Displays a Preview option for individual files (Preview is disabled for read-only Admins)

Note

Certain file types, including .log, .exe, .dll and .bin files are skipped for full content indexing, and are therefore not available to preview.

l Displays a More option to open a window to view additional details

l Checkboxes to select multiple items

l Top level checkbox to select all items

l Top level Restore option for multiple items

l Display the number of matching search results

Restore, download, and query jobs for the logged in user and the respective results for downloads and queries are shown in View Jobs.

The following table describes the specific information available when you click to view More for a file.

Table 20 Details

More Info item Description

Index Information

Index Name The full name of the index

Full content indexed? True/False (Metadata index if false)

Index Date The date the file was indexed

Item Information

Client operating system The operating system for the backup client

Created Date The date the file was created and item specific metadata. Available only for Windows backups on the NetWorker server, excluding Linux and UNIX backups

Title Only available for full content indexed and found in item-specific metadata

Author

Subject

Backups All backups containing files are listed here. Listed for each file:

l Backup date

l Backup number

l Expiration date

l Full item-specific metadata information (list of name/values pairs)

Search criteria management To narrow the search and reduce the time taken to return results, add search criteria.

The following lists the available Search Criteria:

l Index

l Platform

l Server

l Client

l File Name

l File Type

l Size

l Last Modification Date

l Backup Date

l Location

l Unindexable Content

Index Lists the indexes that the currently logged in Search Admin has access to. For example, searchadmin1 might have access to indexes 1 and 2, and searchadmin2 might have access to indexes 2, 3, and 4. To remove indexes from the search, uncheck them.

Platform The following filters are available for the Platform criteria:

l All (default)

l Avamar

l NetWorker

Server Specify the name of an Avamar or NetWorker server. There is a visual filter available for this option.

Client Select a backup server and then specify the name of an Avamar or NetWorker client.

File Name The following table lists the available filters for the File Name criteria.

Table 21 File Name criteria

File Name value Description

Contains (default) Specify a keyword contained in, beginning, or ending with the file name. Can include wildcard characters (*).

Begins with

Ends with

File Type Narrow the search by File Type by typing a file name extension in the dialog box or using the pie chart icon. The pie chart breaks down the current search by frequency of file type. Only the top 10 most frequent file types in the search are shown. Each file type extension is listed below the pie chart. To narrow the search by file type, perform one of the following steps:

l Click the Pie chart icon.

l To eliminate that type of file from the current search, click a file type extension below the pie chart.

l To limit the search to only that type of file, click a file type extension in the pie chart.

Figure 21 File Types

Size Specify the file size (always in KB). To display a visual breakdown of the current search results, click the bar chart icon next to the size control. The frequency that each range of file sizes occurs is represented in a bar chart. To rerun the current search filtered by that size range, click the bar.

Last Modification Date Specify the date on which the file was last modified. You can specify a Greater Than, Less Than, or Between value, and a visual filter is also available.

For the Last Modification Date criteria, jobs are listed in the View Jobs list as Query jobs. When the job completes, click Query Result to view the details.

Backup Date To display a visual breakdown of the current search results for the backup date, click the bar chart icon next to the date control. Click a particular year to divide the results by month. To rerun the current search filtered by that date range, click a month.

Note

Using a wide range for the Backup Date criteria to search for a string can result in a long search window and negatively impact performance. Restrict the range to a single backup date for this filter to avoid performance issues. To enhance performance, jobs run in the background.

The following table lists the available filters for the Backup Date criteria.

Table 22 Backup Date criteria

Backup Date value Description

Greater Than Specify a backup date value for which the search returns hits between that date and now.

Between To limit the search results to the specified time period, include two date values.

Less Than Specify a backup date value for which the search returns hits from before that date.
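
The three Backup Date values in Table 22 can be read as simple date comparisons. The following sketch shows that reading. The function name is hypothetical, and whether the boundary dates themselves are included in the results is an assumption, because the table does not say.

```python
from datetime import date


def matches_backup_date(backup_date: date, criterion: str,
                        first: date, second: date = None) -> bool:
    """Return True if backup_date satisfies the Backup Date criterion."""
    if criterion == "Greater Than":
        # Hits between the given date and now.
        return backup_date > first
    if criterion == "Less Than":
        # Hits from before the given date.
        return backup_date < first
    if criterion == "Between":
        if second is None:
            raise ValueError("Between requires two date values")
        # Assumption: both end dates are included in the range.
        return first <= backup_date <= second
    raise ValueError(f"Unknown criterion: {criterion}")
```

A narrow Between range, such as a single day, corresponds to the recommendation in the preceding note to restrict the range for this filter.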

Location Specify the location of the file by using the physical path or just segments of the path.

Unindexable Content Select Yes or No for this search criteria. DPSearch can find (or exclude) files that could not be full-content indexed.

Search performance factors The speed at which search results are returned is dependent on a number of factors, including the following:

l How many items are in the indexes being searched

l If the Index filter is used to restrict searches to a specific index or indexes

l The number of configured Index Data nodes: more nodes to distribute processing increase search performance

l The number of simultaneous queries: multiple Index Data nodes enhance performance by distributing the processing across the nodes

l Replicas: if an index is static, replicas can speed up performance, but only in a multi-node cluster

l Whether collections are running for indexes during a search operation: collections can slow down some types of queries

l Search scope: search filters narrow the search scope and improve search performance

Also, a search that is applied to a billion items is slower than one with millions of items. Filters can be combined for greater benefit. For example, all .jpg files modified in the last year from client "my-sles-client".

CHAPTER 12

Troubleshooting

This section includes the following topics:

l The Data Protection Search log files

l Viewing and filtering log files with Data Protection Search log viewer

l Troubleshooting the Data Protection Search web server

l Troubleshooting web services for collector issues

l Increasing the maximum memory for the dpworker service

l Elasticsearch troubleshooting

The Data Protection Search log files This section provides information on the log files available for Data Protection Search components.

The Data Protection Search default log directory, /usr/local/dpsearch/logs, has the following logs:

l Worker service (dpworker.log)

l Web search GUI (dp_search_web.log)

l Web search API (dp_search_api.log)

l Web Admin GUI (dp_admin_web.log)

l Web Admin API (dp_admin_api.log)

Use a secure FTP client, such as WinSCP or PuTTY (psftp), to copy log files from the Data Protection Search nodes to a Windows computer, as described in the following sections.

Note

The WinSCP tool provides a GUI, and retains the settings, including both local and remote directory locations.

Copying log files by using WinSCP The WinSCP tool has an advantage over other tools as it provides a GUI, and retains the settings, including both local and remote directory locations.

Before you begin

Install WinSCP by downloading WinSCP from winscp.net, and following the prompts in the wizard.

Procedure

1. Select Stored sessions, and click New.

2. Add Session by completing the following fields:

l Hostname

l Port number (default is 22)

l Root username

l Password

3. Click Directories and complete the following fields:

l Remote directory: Type /usr/local/dpsearch/logs

l Local directory: Type a local directory of your choice

4. To save the session, click Save, and then click Login.

5. Drag the logs that you want to view from the remote directory section (/usr/local/dpsearch/logs) of the window to the local directory section of the window.

Copying log files by using PuTTY

Before you begin

PuTTY must be installed. Download PuTTY from winscp.net, and follow the prompts in the PuTTY installation wizard.

Use a secure FTP client, such as PuTTY (psftp), to copy log files from the Data Protection Search nodes to a Windows computer.

Procedure

1. Log in with the Data Protection Search credentials, or the default username and password, dpsearch/dpsearch.

2. Change to the log directory:

cd /usr/local/dpsearch/logs

3. Use the mget * command to download all the log files, or download the log files individually.

4. Unzip the log files if required.

Older versions of the logs are compressed based on size/date.
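
Where a scripted copy is preferred over an interactive client, the same log files can be fetched with a command-line scp client. The following sketch only builds the command. It assumes that an OpenSSH-style scp client is installed on the computer that runs the script, which this guide does not state, and the host name, account, and local directory are placeholders.

```python
LOG_DIR = "/usr/local/dpsearch/logs"
LOG_FILES = [
    "dpworker.log",
    "dp_search_web.log",
    "dp_search_api.log",
    "dp_admin_web.log",
    "dp_admin_api.log",
]


def scp_command(host, user, local_dir, port=22):
    """Build an scp command that copies the current log files to local_dir."""
    sources = [f"{user}@{host}:{LOG_DIR}/{name}" for name in LOG_FILES]
    # -P sets the SSH port for scp (22 is the default port).
    return ["scp", "-P", str(port), *sources, local_dir]
```

To run the copy, pass the returned list to `subprocess.run(..., check=True)`. Older, compressed log files are not included in this list and would need to be added by name.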

Viewing and filtering log files with Data Protection Search log viewer

The Data Protection Search log viewer tool is a separate executable that accompanies the Data Protection Search binaries. The Data Protection Search log viewer provides the ability to browse and filter log files.

Before you begin

Download DPSearchLogViewer.exe, and run the executable to install the Data Protection Search log viewer.

Procedure

1. In the Open window, browse to the directory in which the log files are.

2. To hide or show errors, warnings, information, and verbose trace messages (if enabled), click the icons in the toolbar.

Hide info/verbose messages to quickly identify errors and/or warnings.

3. Filter logs by Module, Process, or Machine and specify specific strings.

4. To view, select the file, and click Open.

Troubleshooting the Data Protection Search web server There are log files, commands, and configuration files for troubleshooting the Data Protection Search web server.

Web server control commands This section lists the service commands available for managing the web server (NGINX):

l service nginx reload

l service nginx stop

l service nginx start

l service nginx reopen
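
If these commands are run from a maintenance script, a thin wrapper that accepts only the four listed actions reduces the chance of a typing error. The following is a minimal sketch with hypothetical function names. It assumes that the script runs on the Data Protection Search node with sufficient privileges to manage the service.

```python
import subprocess

# The actions listed in this section.
ALLOWED_ACTIONS = ("reload", "stop", "start", "reopen")


def nginx_service_command(action):
    """Return the command for a listed action, or raise ValueError."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"Unsupported action: {action}")
    return ["service", "nginx", action]


def run_nginx_service(action):
    """Run the command and raise CalledProcessError on a nonzero exit code."""
    return subprocess.run(nginx_service_command(action), check=True)
```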

Configuration files The following table lists the available configuration files.

Table 23 Available configuration files

Configuration file | Description | Location

nginx.conf | Defines ports and SSL certificates. | /etc/nginx/

system.conf | Provides the ability to manage the DPSearch log files. | /usr/local/dpsearch/etc/

The default location for the nginx.conf, and system.conf files is /etc/nginx.

Web server logs View the log files by using Data Protection Search log viewer on a Windows computer.

The Data Protection Search log names and locations are listed here:

l /usr/local/dpsearch/logs/dp_admin_api.log l /usr/local/dpsearch/logs/dp_search_api.log l /usr/local/dpsearch/logs/dp_admin_web.log l /usr/local/dpsearch/logs/dp_search_web.log

Data Protection Search configuration files You can use a text editor, such as vi or vim, to edit the nginx.conf and system.conf files on the Linux terminal.

The following sections provide more information on editing the Data Protection Search configuration files.

Edit the Data Protection Search nginx.conf file Use a text editor, such as vi or vim, to edit the nginx.conf file on the Linux terminal.

Use the nginx.conf file to define ports, and manage SSL certificates and keys. To edit the Data Protection Search nginx.conf file, perform the following tasks:

Procedure

1. With a text editor, open the following file:

root /etc/nginx.conf

2. If needed, modify the ports, SSL certs and keys for the following:

l root /usr/local/dpsearch/httpds (Admin UI and Search UI)

n Port 443 (default)

n SSL_certificate dpsearch.cert

n SSL_certificate_key dpsearch.key

l root /usr/local/dpsearch/httpds/admin/api/public (Admin Rest API)

n Port 448 (default)

n SSL_certificate dpsearch.cert

n SSL_certificate_key dpsearch.key

l root /usr/local/dpsearch/httpds/search/api/public (Search Rest API)

n Port 449 (default)

n SSL_certificate dpsearch.cert

n SSL_certificate_key dpsearch.key

3. To save the changes, restart NGINX.
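
After NGINX restarts, a quick way to confirm that the configured ports accept connections is to attempt a TCP connection to each one. The following sketch uses the default ports listed in step 2. The function name is hypothetical, and the check only shows that something is listening on the port, not that the certificate or the application behind it is healthy.

```python
import socket

# Default ports from the nginx.conf procedure.
DEFAULT_PORTS = {
    "Admin UI and Search UI": 443,
    "Admin REST API": 448,
    "Search REST API": 449,
}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

For example, looping over `DEFAULT_PORTS.items()` and calling `port_is_open(nodename, port)` gives a per-service view of which ports respond, where `nodename` is the node being checked.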

Manage logs with the system.conf file

The system.conf file provides the ability to manage the following:

l Log file location

l Log level

l Log count

l Log size

Procedure

1. Open the /usr/local/dpsearch/etc/system.conf file with the text editor.

2. Modify any of the following.

l Log file location:

n log_path_admin_api: /logs/dp_admin_api.log (default)

n log_path_admin_web: /logs/dp_admin_web.log (default)

n log_path_search_api: /logs/dp_search_api.log (default)

n log_path_search_web: /logs/dp_search_web.log (default)

l Log level:

n 0 = verbose

n 1 = info (default)

n 2 = warning

n 3 = error

l Log count: The number of old log files to keep (default: 10)

l Log size: The threshold at which the log files roll over (default: 100,000)

3. Restart NGINX for the changes to take effect.
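The settings above can be pictured as simple data. The following Python sketch is only an illustration: it assumes that system.conf stores one "key: value" pair per line, which the guide does not confirm, and the helper names parse_settings and level_name are hypothetical. The log path keys and the level numbers are the ones listed in the procedure.

```python
# Log levels as listed in the procedure above.
LOG_LEVELS = {0: "verbose", 1: "info", 2: "warning", 3: "error"}

# Hypothetical excerpt; assumes one "key: value" pair per line.
SAMPLE = """
log_path_admin_api: /logs/dp_admin_api.log
log_path_search_api: /logs/dp_search_api.log
"""


def parse_settings(text):
    """Parse "key: value" lines into a dict, skipping lines without a colon."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        settings[key.strip()] = value.strip()
    return settings


def level_name(level):
    """Map a numeric log level to its name; reject values outside 0-3."""
    if level not in LOG_LEVELS:
        raise ValueError("log level must be one of %s" % sorted(LOG_LEVELS))
    return LOG_LEVELS[level]


settings = parse_settings(SAMPLE)
print(settings["log_path_admin_api"])
print(level_name(1))
```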

Troubleshooting web services for collector issues

There are log files, commands, and configuration files available for troubleshooting the Data Protection Search Worker service.

Collector service control commands

The following are the commands available for troubleshooting the Data Protection Search Collector service:

l sudo /sbin/service dpworker status

l sudo /sbin/service dpworker stop

l sudo /sbin/service dpworker start

l sudo /sbin/service dpworker restart

Worker service log file

The dpworker log is located at /usr/local/dpsearch/logs/dpworker.log. View the log file by using the Data Protection Search log viewer on a Windows computer.

By default, dpworker.log rolls over at 100 MB or every 7 days. You can modify these settings in the /usr/local/dpsearch/etc/log4j2.xml file. For the changes to take effect, restart dpworker.

Table 24 Example log messages

Log message | Description
INDEX FAIL: Connect failed to retrieve item | Cannot find the item that the connector restored to the local disk
INDEX FAIL: Cannot read retrieved local item | Cannot read the item on the local disk. Usually, the error is a permission issue
UNINDEXABLE: Failed to scrap text | Tika cannot process the item
UNINDEXABLE: System error when scrapping text | JVM error while Tika is processing the item. Out-of-memory is the prime cause
INDEX FAIL: BulkResponse with fail message | Elasticsearch reports that it cannot process the index/update sub-request for this item
INDEX FAIL: Bulk request failure | Elasticsearch might be unavailable, there might be a network issue, and so on
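The message prefixes in Table 24 lend themselves to quick triage of a copied log. The following Python sketch counts lines by prefix. It is a minimal illustration only: the prefixes come from the table, but the category labels, the helper names, and the sample lines are hypothetical, and real dpworker.log lines probably carry timestamps and other fields around the message.

```python
# Prefixes taken from Table 24; the category labels are illustrative.
PREFIXES = {
    "INDEX FAIL:": "indexing failure",
    "UNINDEXABLE:": "content could not be extracted",
}


def classify(line):
    """Return a label for a log line, or None if no known prefix appears."""
    for prefix, label in PREFIXES.items():
        if prefix in line:
            return label
    return None


def count_by_label(lines):
    """Count how many lines fall under each label."""
    counts = {}
    for line in lines:
        label = classify(line)
        if label is not None:
            counts[label] = counts.get(label, 0) + 1
    return counts


# Hypothetical lines; a real log line may include more than the message.
SAMPLE_LINES = [
    "INDEX FAIL: Cannot read retrieved local item",
    "UNINDEXABLE: Failed to scrap text",
    "INDEX FAIL: Bulk request failure",
    "some unrelated message",
]
print(count_by_label(SAMPLE_LINES))
```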

Table 25 Example Avamar and NetWorker log messages

Log message | Description
Recover: Failed to recover save sets: Recover path 'Document.ott' not an absolute path | Bad restore path specified
Recover: Failed to recover save sets: Remote process exited with errors | Problem with the specified save set. Check the NetWorker logs
Recover: Cannot contact media database on avamar70.es1dev.com: Program not registered | A bad client is specified as the NetWorker restore destination
nsradmin: Program not registered. There does not appear to be a NetWorker server running on Avamar70.es1dev.com | Error querying the NetWorker server (for example, the wrong server name is specified)
Cannot login to Avamar server. User login failure | The Avamar password is incorrect
Cannot restore Avamar backup: Miscellaneous error | Some or all files cannot be restored during an Avamar restore. Use the Avamar Administrator to view the exact details of the failures

Increasing the maximum memory for the dpworker service

If the dpworker service runs short of memory and the system has enough free memory, you can increase the maximum memory available to the service. Memory issues can occur when performing full-content indexing on large files.

Procedure

1. Open the following file:

/etc/init.d/dpworker

2. Change the -Xmx memory value in the following line:

ARGUMENTS="-Xms1024M -Xmx3072M -XX:-UseGCOverheadLimit...

3. Save the file, and restart the dpworker service.
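The edit in step 2 amounts to replacing one token in the ARGUMENTS line. The following Python sketch shows that substitution on the line quoted above. The function name set_max_heap and the 4096 MB value are illustrative assumptions, not recommendations from this guide; choose a value that fits the memory of your node.

```python
import re


def set_max_heap(line, megabytes):
    """Return the line with its -Xmx value replaced by <megabytes>M.

    Raises ValueError if the line has no -Xmx option.
    """
    new_line, count = re.subn(
        r"-Xmx\d+[kKmMgG]?", "-Xmx%dM" % megabytes, line, count=1
    )
    if count == 0:
        raise ValueError("no -Xmx option found")
    return new_line


# The line as quoted in the procedure (the remainder is elided in the guide).
original = 'ARGUMENTS="-Xms1024M -Xmx3072M -XX:-UseGCOverheadLimit...'
print(set_max_heap(original, 4096))
```

Only the -Xmx token changes; the -Xms value and the remaining options are left untouched.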

Elasticsearch troubleshooting

This section includes the following topics.

Controlling the Elasticsearch service

To stop the Elasticsearch service, use the following command. Stop the dpworker service (sudo /sbin/service dpworker stop) before stopping the Elasticsearch service:

sudo /sbin/service elasticsearch stop

To start the Elasticsearch service, use the following command:

sudo /sbin/service elasticsearch start
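The ordering matters when stopping: dpworker goes down before Elasticsearch. The following Python sketch captures that order as a small script. The helper names stop_sequence and run_all and the dry-run default are assumptions made for illustration; the service commands themselves are the ones given in this chapter.

```python
import subprocess


def stop_sequence():
    """Commands to stop dpworker first, then Elasticsearch."""
    return [
        ["sudo", "/sbin/service", "dpworker", "stop"],
        ["sudo", "/sbin/service", "elasticsearch", "stop"],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(command, check=True)


# Dry run: prints the commands in order without executing them.
run_all(stop_sequence())
```

Calling run_all(stop_sequence(), dry_run=False) on a node would execute the commands, so review the printed order first.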

Viewing Elasticsearch logs

Procedure

1. Use one of the following commands to view the Elasticsearch logs:

l tail /var/log/elasticsearch.log

l vim /var/log/elasticsearch.log

2. Download the logs to another computer by using psftp.

Note

You cannot use the Data Protection Search log viewer for Elasticsearch logs.

Viewing or changing Elasticsearch configuration

To view or modify the Elasticsearch configuration, use the following commands:

l more /etc/elasticsearch/elasticsearch.yml

l vim /etc/elasticsearch/elasticsearch.yml

Monitoring the health of the Elasticsearch cluster

Procedure

1. To check the health of the Elasticsearch cluster:

a. Log in to Data Protection Search as a DPSearch Administrator.

b. Click Administration > System.

c. Review the status of the Elasticsearch component.

2. Verify that the Elasticsearch node has been correctly connected to the cluster.

a. Review the following firewall configuration file:

/etc/sysconfig/SuSEfirewall2

b. Verify that the port range 9300-9400 in the node network is open in the following:

FW_TRUSTED_NETS

For example:

FW_TRUSTED_NETS="10.98.27.0/24,tcp,440:449 127.0.0.0/24,tcp,9200 10.98.27.0/24,tcp,9300:9400"
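The check in step 2b can be done by eye, but the following Python sketch shows the logic: split the FW_TRUSTED_NETS value into entries and test whether a TCP entry for the node network covers ports 9300 through 9400. It assumes the space-separated network,protocol,ports entry format shown in the example, with ports given as a single port or a low:high range. The function names are hypothetical, and the value is the example from this procedure, not your site's configuration.

```python
def parse_trusted_nets(value):
    """Split a FW_TRUSTED_NETS value into (network, protocol, ports) tuples.

    Each space-separated entry has the form network[,protocol[,ports]],
    where ports is a single port or a low:high range.
    """
    entries = []
    for item in value.split():
        parts = item.split(",")
        network = parts[0]
        protocol = parts[1] if len(parts) > 1 else None
        ports = None
        if len(parts) > 2 and parts[2]:
            low, _, high = parts[2].partition(":")
            ports = (int(low), int(high or low))
        entries.append((network, protocol, ports))
    return entries


def range_is_open(entries, network, low, high):
    """True if an entry for the network allows every TCP port in low..high."""
    for net, protocol, ports in entries:
        if net != network or protocol not in (None, "tcp"):
            continue
        # An entry without a port list trusts all ports for that protocol.
        if ports is None or (ports[0] <= low and high <= ports[1]):
            return True
    return False


VALUE = "10.98.27.0/24,tcp,440:449 127.0.0.0/24,tcp,9200 10.98.27.0/24,tcp,9300:9400"
entries = parse_trusted_nets(VALUE)
print(range_is_open(entries, "10.98.27.0/24", 9300, 9400))
```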

GLOSSARY

A

all-in-one node Data Protection Search can be deployed as an all-in-one node that includes both Worker and Index Master/Data (Elasticsearch) components on a single VM. This all-in-one solution is capable of managing many billions of backup records.

However, for larger environments, or improved index/search performance, additional Worker and/or Index Data nodes can be added.

analyzer An Elasticsearch Analyzer is a set of rules defining how to convert text into indexed terms. An analyzer is composed of Tokenizers and Token Filters. The analyzer is applied both at index time and search time.

Data Protection Search lets you specify one of several default Elasticsearch Analyzers for each index that is created.

Apache Lucene Apache Lucene powers the underlying index and search technology of Elasticsearch. Essentially, Elasticsearch is an API and infrastructure built around Lucene.

Apache Tika Apache Tika is the toolkit Data Protection Search uses to extract text and item specific metadata from files that are full-content indexed.

B

backup Backup is used generically by Data Protection Search to refer to a single backup of an Avamar or NetWorker client.

Data Protection Search does not distinguish between different save sets. If there are multiple save sets covering different directories on the same client, they are treated as individual backups. If you specify backups for a particular day, all save sets on that day are included.

Data Protection Search only processes file system backups. VM, database, block-based, and NDMP backups are ignored.

base distinguished name Base distinguished name (DN) of Users/Groups for Data Protection Search to use.

blackout window A daily time frame during which Data Protection Search does not process any collection jobs, downloads, or restore tasks for a particular source server. The blackout window can be used to avoid these activities during a backup or maintenance window.

C

client An Avamar or NetWorker client machine. Client machines can be workstations, PCs, or file servers.

Common Indexing System

Common Indexing System (CIS) is a Data Protection Search component that provides a security layer on top of Elasticsearch.

connectors Data Protection Search uses connector components to communicate with backup servers. There are separate connectors for Avamar and NetWorker servers.

D

dashboard The landing page for the Data Protection Search Admin UI. The dashboard provides a summary of the system health and status.

default gateway A default gateway is the node on the network used for IP addresses that do not match any other routes in the routing table. The default gateway is used to connect local Avamar or NetWorker servers to an external network, or the internet.

disable collection Sets a collection activity to the Inactive state, preventing it from running based on its current schedule.

download Search Admins with full access rights have the ability to download search hits by clicking on the filename. The file is initially restored to a Data Protection Search Worker node, and from there can be downloaded to a local machine by using the web browser.

DPSearch Admin DPSearch Admins are responsible for monitoring and administering the Data Protection Search system. This includes adding and maintaining backup servers, monitoring jobs, and maintaining users and roles.

All DPSearch Admins must be a member of the DPSearch Admins Group added at deployment time. By default, DPSearch Admins are also Index Admins.

DPSearchLogViewer A tool that is shipped with Data Protection Search that can be used to view, filter and search Data Protection Search log files. The log viewer runs on Windows only.

dpworker DPWorker is the name of the Data Protection Search Linux service/daemon that implements worker functionality. This includes collections/indexing, downloads, restores, and system jobs.

If the worker service for a particular node is not running for some reason, there will be a message displayed in the Data Protection Search Admin Dashboard. At the terminal, the status of the service can be checked with service dpworker status, and it can be started with service dpworker start.

E

Elasticsearch A search server based on Lucene. It provides a distributed, multitenant-capable full-text search engine with a RESTful web interface and schema-free JSON documents. At the terminal, the status of the service can be checked with service elasticsearch status, and, if necessary, it can be started with service elasticsearch start.

Elasticsearch cluster A collection of Elasticsearch Master and/or Data nodes.

Elasticsearch node A running instance of Elasticsearch. Each node can be Master and/or Data node, and together a group of nodes form an Elasticsearch cluster.

enable collection Sets a collection activity to the Active state, allowing it to run based on its schedule.

End User License Agreement (EULA)

The End User License Agreement for Data Protection Search. The EULA must be accepted on each node before Data Protection Search can be configured.

F

field The name of an indexed property in Elasticsearch. Similar in concept to a column in a relational database.

full-content index When full-content indexing is specified for a collection activity, the content of the file is indexed in addition to the metadata. This includes:

l Content - Text found in the body of the file

l Preview - An HTML representation of any text, or thumbnail for image files

l Item-specific metadata - Application specific name/value pairs (for example, Office metadata such as author, subject, and title)

Note

If Data Protection Search is unable to pull text or an image from a file, only the metadata is included. The Unindexable content flag is set and can be used to include or exclude such items from a search.

G

garbage collection Garbage collection is used to schedule the removal of files from indexes that no longer exist on the backup server. The removal is based on the internal cache of backup expiration dates maintained by Data Protection Search. See also Reconciliation.

H

head plugin A commonly used plugin for Elasticsearch. Can be used to monitor the cluster and index states, and run queries.

heap size During deployment, the heap size for Elasticsearch nodes must be specified. Generally, Elasticsearch should be restricted to no more than half of the total memory of the virtual appliance, to avoid starving the operating system of resources. In particular, the underlying Lucene components leveraged by Elasticsearch need to use off-heap memory for caching in-memory data structures.

I

index A repository for indexed metadata and content. Made up of primary and replica shards.

index admin Index Admins are responsible for maintaining Indexes and creating and monitoring Collection activities. When creating Collection activities, the index admin's permission level determines whether they can specify full-content indexing or are limited to metadata-only indexing.

Index data node A DPSearch virtual appliance that is configured to include Elasticsearch components. A second hard drive for index storage is required.

Worker components are not included. Avamar and NetWorker clients are not required, and the node does not have to be registered on the NetWorker servers.

index master node The Index Master node is responsible for managing and routing requests (search queries and indexing requests) to all nodes within an Elasticsearch cluster. The Common Indexing Service (CIS) components are also on the Index Master node. There can only be a single Index Master node in a cluster, but it can communicate with many Index Data nodes. DPSearch Index Master nodes are also Index Data nodes.

item specific metadata Item specific metadata is specific to a particular file format/application. For example, Microsoft Office documents often have Author, Title, Subject, Word Count, and so on. Photos can have details of the camera, resolution, date taken, and so on. If a file is full-content indexed, this information is extracted and added to the index. Item specific metadata can be searched, and viewed (More Info).

L

language analyzer Data Protection Search supports a wide variety of language-specific analyzers. Each of these provides stemming and removes stopwords appropriate for that language.

ldap LDAP (Lightweight Directory Access Protocol) is a directory service protocol that is leveraged by Data Protection Search to communicate with directory services, such as Microsoft Active Directory. All users and groups within Data Protection Search are added from an LDAP server.

lucene syntax In addition to simple keywords, a powerful search query language can be entered into the search bar. This is powered by the underlying Apache Lucene technology.

M

mapping The schema/definition for an Elasticsearch index.

metadata index When metadata only indexing is specified for a collection activity, only the file system and backup metadata will be indexed. For the file system this includes:

l Filename

l Path

l Size

l Date

l Extension

l MIME

l Type

For the backup server, this includes:

l Platform

l Item

l Type

l Server (ID)

l Client (ID and name)

l Client operating system

Details of the backups/save sets in which the file was found are also stored, as is the date/time the item was indexed.

more info Each search result for Data Protection Search has a More Info link below it. Clicking the more info link displays additional information for the search hit, including any item specific metadata and a list of all backups containing the file.

N

name server One or more DNS (Domain Name System) servers.

NGINX An open source, reverse proxy web server supporting load balancing. NGINX hosts the Data Protection Search Admin and search web applications, as well as the Common Indexing System components.

notifications Data Protection Search notifications are sent by email to report the status of restore operations, collection jobs, indexes, and the backup servers. Notifications are also listed in, and can be viewed from, System Notifications on the Dashboard page of the Data Protection Search Admin UI.

Notifications require an SMTP server.

O

OVF The Open Virtualization Format (OVF) file containing the definition of the Data Protection Search Virtual Appliance.

P

passenger Data Protection Search uses Phusion Passenger as an application server within the NGINX web server.

platform Data Protection Search supports indexing of two backup platforms, Avamar and NetWorker.

plugins Elasticsearch supports a number of third-party plugins, such as Head, Kibana, and ElasticHQ. These can be used to monitor and query the Elasticsearch cluster and indexes.

Plugins can be installed for the Elasticsearch instances on Data Protection Search nodes. To access them remotely, port 442 must be specified, and the user must log in with the CIS admin account.

preview Each search result for Data Protection Search has a Preview link below it. Only full-content indexed items have an active Preview link, and only when a Search Admin with full access rights is logged in.

Clicking the Preview link displays a preview of the file stored in the index. For text files, this includes a simplified text only version of the file. For image files, this includes a thumbnail of the file.

primary shard Indexed documents go into the primary shard first. By default, there are 5 primary shards per index.

R

reconciliation Data Protection Search runs monthly reconciliation jobs to synchronize Data Protection Search with the information in the backup servers. Specifically, the backup expiration date that is used during garbage collection is updated.

replica shard A copy of the primary shard. Increases performance and failover.

restore Each search result for Data Protection Search has a Restore link below it, and a checkbox to multi-select and restore multiple items.

Items can be restored to the original location, an alternate location on the same client, or to another client on the same backup server. Restore operations to another client on the same backup server are subject to restrictions imposed by the backup server itself. It is not possible to restore a Linux/Unix file to a Windows client (and vice versa).

Only Search Admins with full access rights are able to complete a restore operation to an alternate location.

run now Active scheduled collection activities can be forced to Run Now, rather than waiting for the next scheduled time for the activity to run.

S

save set The smallest unit of data that NetWorker backs up. It is composed of one or more files and/or one or more file systems on a single client.

Data Protection Search does not distinguish between different save sets. If there are multiple save sets covering different directories on the same client, they are treated as individual backups. If you specify backups for a particular day, all save sets on that day are included.

scraped text The text that is extracted from a file by Apache Tika during full-content indexing.

search admin Search Admins have the ability to log on to the Data Protection Search Web interface and run searches. They are given permissions to particular indexes by Index Admins in the Admin UI. Search Admins have full access or read only permissions.

search filters Filters that can be applied to Data Protection Search searches. Filters narrow down the search scope and improve search performance. Search filters include filename, location (path), file type, size, last modification date, backup date, index, platform, server, client, and unindexable file content.

shard A single Lucene instance. Multiple shards make up an Index.

simple analyzer The Simple Analyzer can be used as an alternative to the Standard Analyzer when creating an index. Like the Standard Analyzer, it lowercases any text, but numeric values are discarded.

SLES Linux SUSE Linux Enterprise Server.

SMTP Simple Mail Transfer Protocol (SMTP) is an Internet standard for e-mail transmission. An SMTP server must be specified for Data Protection Search to send email notifications.

source server A server from which content is processed for indexing. For Data Protection Search, source servers are backup servers.

Spawning The status for a job that is claimed by the worker service, but has not yet started.

standard analyzer The default analyzer, recommended for all but advanced users. This analyzer converts all text to lowercase, which means that searches will match regardless of case. The analyzer handles tokenization of email and other text strings. For Data Protection Search, the analyzer does not support stemming or stop words.

stemming Stemming modifies words to their basic root or stem. This allows keyword searches to match a broader set of hits.

stop words Stop words prevent unnecessary words such as "a", "and", "the" from being indexed. This results in smaller indexes, faster queries, and broader matches.

subnet mask The subnet mask is used to determine to which subnet an IP address belongs. It must be specified during deployment by using the yast tool. It can be entered as an IP mask such as 255.255.255.0, or by using Classless InterDomain Routing (CIDR) notation such as /24.

SUSE Linux SUSE (Software und System Entwicklung) is an open source software company that develops and sells Linux products. Data Protection Search is deployed on SUSE Linux Enterprise Server.

system jobs Regular background activities that Data Protection Search runs to monitor backup servers, worker nodes, index nodes, and to handle garbage collection.

T

term A value that is indexed in Elasticsearch.

tokenization The process of breaking a string down into a stream of indexable terms or tokens.

U

Unicorn Unicorn is an http service that must be running on the Data Protection Search Index Master node for CIS to work.

V

view jobs The Data Protection Search search interface includes a link to the right of the search bar for viewing search-related jobs. Only jobs launched by the current user can be viewed.

virtual appliance Data Protection Search is deployed as a virtual appliance on VMware infrastructure. Each node is a Virtual Machine (VM) running SUSE Linux Enterprise Server (SLES).

visual filters Search filters that have a graphical element, allowing aggregated results to be viewed and interacted with. Server, File Type, Size, and Last Modification Date filters all have visual filters.

VMDK The Virtual Machine Disk file (VMDK) containing the Data Protection Search Virtual Appliance software that is deployed with the Data Protection Search OVF.

W

whitespace analyzer The Whitespace Analyzer can be used as an alternative to the Standard Analyzer when creating an index. Searches using the Whitespace Analyzer are case sensitive.

worker node A DPSearch virtual appliance that is configured to include just the components needed to process metadata and content from the backup servers. Clients for Avamar and/or NetWorker run on the node. For NetWorker, the worker node must be registered on any NetWorker servers.

Elasticsearch is not included, and a second hard drive for storage is not required.

Y

YaST Yet another Setup Tool, or YaST, is a Linux operating system setup and configuration tool often included with SUSE Linux. Data Protection Search uses YaST to let administrators set up networking. It is automati
